For professional basketball, finding valuable and suitable players
is the key to building a winning team. To deal with such challenges,
basketball managers, scouts and coaches are increasingly turning to
analytics. Objective evaluation of players and teams has always
been the top goal of basketball analytics. Typical statistical analytics
mainly focuses on the box score and has developed various metrics.

Despite increasingly advanced methods, metrics built upon
box score statistics provide limited information about how players 
interact with each other. Two players with similar box scores may 
deliver distinct team plays. Thus professional basketball scouts have 
to watch real games to evaluate players. Live scouting is effective, but 
suffers from inefficiency and subjectivity. In this paper, we go beyond 
the static box score and model basketball games as dynamic networks. 

The proposed Continuous-time Stochastic Block Model clusters the 
players according to their playing style and performance. The model 
provides cluster-specific estimates of the effectiveness of players at 
scoring, rebounding, stealing, etc., and also captures player interaction
patterns within and between clusters. By clustering similar players
together, the model can help basketball scouts to narrow down
the search space. Moreover, the model is able to reveal the subtle
differences in the offensive strategies of different teams. An application
to NBA basketball games illustrates the performance of the model. 


1. Introduction. For decades, basketball data analysis has gained enormous
attention from basketball professionals and basketball enthusiasts from
various fields. The top goal has always been to better understand how
players and teams play, and conduct evaluations more efficiently and objectively.
Over the last few years, the explosion of available data, the growth of
computer power and the developments of statistical models have made complex
modeling of basketball data possible. A revolution is happening in the field 
of basketball data analysis. 

The traditional approaches focus on the box score, which lists the
statistics of players and teams of each game, for example, number of field goals
attempted, field goals made, rebounds, blocks, steals, plus-minus (+/-), and
other snapshot statistics. By combining the box score statistics, empirically
or through regression analysis, various metrics have been developed to
evaluate player and team performances (Oliver, 2004; Shea and Baker, 2013).
However, “there is no Holy Grail of player statistics” (Oliver, 2004). As 
pointed out by Shea and Baker (2013), the metrics are either “bottom up” or 
“top down”. Bottom-up metrics mostly focus on the individual performance, 
whereas top-down metrics emphasize the team level. Traditional box
score metrics mostly fail to take into account two important factors of
basketball: the interaction of players and the fact that a basketball play is a
real-time process. 

Recently, researchers have started to investigate basketball games from 
these two perspectives. By treating player positions (point guard, shooting 
guard, small forward, power forward and center) as network nodes and ball 
passes as network edges, Fewell et al. (2012) advocate “Basketball is not a 
game, but a network”. They illustrate ball transition patterns of different 
teams by their basketball networks. Additionally, they quantitatively
analyze basketball games and teams by calculating network properties such as
degree centrality, clustering coefficient, network entropy and flow
centrality. However, when building the networks, Fewell et al. (2012) only consider
the cumulative passes of games. Hence, the networks are not able to
capture details of basketball plays. Neither can they describe players’ individual
performances. In 2013, the National Basketball Association (NBA) installed
optical tracking systems (SportVU technology) in all thirty courts to collect 
real-time data. The tracking system records the spatial position of the ball 
and the positions of all players on the court at any time of the game. It also 
records all actions of the games. Using such comprehensive data, Cervone
et al. (2016) model the evolution of a basketball play as a complex
stochastic process. Their model reveals both offensive and defensive strategies
of players and teams. Ultimately, the model estimates the expected scores an 
offensive team can make at any time of the play. The two approaches above 
certainly provide more insights and more accurate evaluations of players, 
teams and basketball plays. 

In the NBA, teams obtain new players through trades, free agency and the 
annual draft. There are so many potential players, especially college players, 
that no scout is able to keep close track of all of them. Clustering players
into a number of groups, according to their performances and playing styles,
can efficiently narrow down the target space. When searching for players, 
basketball managers, scouts and coaches always hope that the new player 
can quickly fit in the current team. Therefore, how players interact with 
teammates is of great importance. This must be taken into account during
the clustering procedure. 

In this paper, we propose a Continuous-time Stochastic Block Model 
(CSBM) to address the problem of player clustering. We model basketball 
games as transactional networks and a basketball play as an inhomogeneous 
continuous-time Markov chain. The CSBM clusters the players according to 
their performances on the court. It also effectively reveals the players’ play 
styles and the teams’ offensive strategies. 

The remainder of the paper is organized as follows. In Section 2, we 
present our view of basketball games as transactional networks and show 
the data format. In Section 3, we introduce the standard Stochastic Block 
Model and construct the Continuous-time Stochastic Block Model. An EM 
algorithm and a complementary algorithm are developed in Section 4. We 
illustrate our model by an application to NBA basketball games in Section 
5. In the end, we summarize our contributions. 

2. Basketball Networks. We begin with a brief introduction to some 
typical networks, before moving to basketball networks. 

A static network is a graph $G = (V, E)$, consisting of a set of vertices (or
nodes) $V$ and a set of edges $E$. A network with $n$ vertices can be represented
by an $n \times n$ adjacency matrix, $A = [A_{ij}]$, where $A_{ij} = 0$ or $1$ indicates the
absence or presence of the $i \to j$ edge. Figure 1 shows a simple undirected
network ($A_{ij} = A_{ji}$ for all $i$ and $j$) with four vertices and four edges.

\[
A = \begin{pmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}
\]

Fig 1. An undirected static network (left) and its adjacency matrix (right)

The relations between pairs of nodes do not have to be binary-valued. When
entries of the adjacency matrix take values other than 0 or 1, the network
is called weighted or multi-edged.
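
As a small illustration of the adjacency-matrix representation, the sketch below encodes the four-vertex network of Figure 1 and checks the two properties stated in the text (symmetry and four edges). The helper names `is_undirected`, `edge_count` and `degrees` are hypothetical and chosen only for this example.

```python
# Adjacency matrix of the four-vertex undirected network in Figure 1.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]


def is_undirected(adj):
    """Return True when A_ij == A_ji for every pair (i, j)."""
    n = len(adj)
    return all(adj[i][j] == adj[j][i] for i in range(n) for j in range(n))


def edge_count(adj):
    """Count undirected edges; each edge appears twice in a symmetric matrix."""
    return sum(sum(row) for row in adj) // 2


def degrees(adj):
    """Number of edges attached to each vertex (row sums)."""
    return [sum(row) for row in adj]


print(is_undirected(A), edge_count(A), degrees(A))
```

For a weighted or multi-edged network, the same list-of-lists layout would simply hold counts or weights instead of 0/1 entries.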

Under certain circumstances, instead of observing an edge between two
nodes, we observe a series of transactions, for example, phone calls among 
a number of people in a period of time. Such networks are transactional 
networks. The corresponding data, as shown in Table 1, simply records the 
senders, the recipients and the time of transactions. 


Table 1
A transactional network

From    To    Time of transaction
1       4     03/29/2015, 08:27
1       7     03/29/2015, 09:01
3       1     03/30/2015, 17:11


We now look at basketball, a team game. Players pass the ball to each 
other and form networks, with players as vertices and passing as transactions 
on edges. A basketball game is made of basketball plays. Generally, a
basketball play starts with inbounding, rebounding, or stealing the ball. During a
play, the team with the ball plays offense and the other team plays defense. 
A play ends when the offensive team shoots the ball (scores or misses but 
the ball hits the rim), makes a turnover, or the offensive player is fouled 
when shooting the ball, etc. In the NBA, the time limit for one play is 24 
seconds. Figure 2 illustrates one basketball play. 

[Figure 2: timeline of one play from time 0 to 24 seconds, with events marked at times $0, t_1, t_2, \ldots, t_m$ and $T$.]

Fig 2. A basketball play. The ball is inbounded to player $r$ at time 0; $r$ passes the ball to
$i$ at time $t_1$; $i$ passes the ball to $j$ at time $t_2$; ...; player $i$ receives the ball at time $t_{m-1}$
and passes it to $r$ at time $t_m$; the play ends when player $r$ scores 2 points at time $T < 24$
seconds.


In a 48-minute NBA game, a team obtains about 90-110 plays. Fewell 
et al. (2012) model basketball games as weighted networks by counting the 
frequencies of ball transitions among the starts/ends of plays and the five 
positions of basketball players (point guard, shooting guard, small forward, 
power forward and center). Figure 3, which is taken from Fewell et al. (2012), 
displays the overall weighted network of 16 NBA games between 16 teams 
they have studied. The network illustrates play patterns and strategies on 
a game level. Fewell et al. (2012) compare the teams by investigating their
networks. However, such a network cannot capture any detail of real-time
basketball play.

[Figure 3: network diagram. Rectangular node labels recovered from the figure: Shooting Foul, 2 Pt. Success; Shooting Foul, 3 Pt. Success; Turnover; Out of Bounds; Defensive Foul; Offensive Foul; Shot Violation; Shooting Foul, 2 Pt. Fail; Shooting Foul, 3 Pt. Fail; 2 Point Success; 3 Point Success; 3 Point Fail.]

Fig 3. Weighted basketball network of 16 NBA games between 16 teams (Fewell et al.,
2012). Circles represent the five positions (point guard, shooting guard, small forward,
power forward, and center), and rectangles represent start or end points of a play. The
width of the edge is proportional to the frequency of the corresponding ball transitions. The
most frequent transition directions, which sum up to 60%, are colored red.

We explore basketball at the play level and take into account time effect. 
More specifically, we regard basketball as a transactional network. Table 2 
illustrates our data, from games 1 and 5 of the 2012 NBA Eastern Conference
Finals between the Miami Heat and the Boston Celtics. We manually
collected the data by watching the videos of the games. 

In a basketball game, only ten players, five from each team, are on the 
court at one time. This means a basketball game is subject to many player 
substitutions. The last column of Table 2 records the players from the
offensive team who are on the court at the events. Such information is necessary
for our model. Note that the player inbounding the ball is treated as being
off the court at the time of that event. For example, in the first row of
Table 2, C#5 is inbounding the ball and not listed as being on the court.

As indicated earlier and shown in Figure 3, there are various ways to start 
and end a play. A play mostly starts with one of the three initial actions: 
inbounding, rebounding and stealing the ball. However, a play technically 
may end with about fifteen different outcomes. For simplicity, we combine 
the outcomes into six categories: making a 2-pointer (Make 2), making a
3-pointer (Make 3), missing a 2-pointer (Miss 2), missing a 3-pointer (Miss
3), being fouled (Fouled) and making a turnover (TO). Scoring and being
fouled at the same time is simply counted as scoring. Catching an air ball
is counted as rebounding. All possible ways of giving up the possession of
the ball such as direct turnover, being out of bounds and offensive foul are
regarded as turnovers. We do not consider rare events such as a jump ball.
We simply discard the rows corresponding to the rare events.

Table 2
Two plays from game 1 of the 2012 NBA Eastern Conference Finals between the Boston
Celtics and the Miami Heat. The top three lines show one play for the Boston Celtics.
The ball is inbounded to C#9 (Rajon Rondo) at time 0; Rondo dribbles the ball and
passes it to C#5 (Kevin Garnett) at second 11; Garnett misses a 2-pointer shot at
second 12. Lines 4 to 9 illustrate one play for the Miami Heat.

From      To        Time(s)   Players on the court
Inbound   C#9       0         C#9, C#20, C#30, C#34
C#9       C#5       11        C#5, C#9, C#20, C#30, C#34
C#5       Miss 2    12        C#5, C#9, C#20, C#30, C#34
Rebound   H#6       0         H#3, H#6, H#15, H#21, H#31
H#6       H#3       7         H#3, H#6, H#15, H#21, H#31
H#3       H#15      8         H#3, H#6, H#15, H#21, H#31
H#15      H#3       9         H#3, H#6, H#15, H#21, H#31
H#3       H#6       12        H#3, H#6, H#15, H#21, H#31
H#6       Miss 3    17        H#3, H#6, H#15, H#21, H#31

Although we group events into plays in Table 2, the model developed later 
in Section 3 will treat each event as an individual occurrence, ignoring which 
play it belongs to. That is, the data in Table 2 will be seen as 9 isolated 
events (each with a timestamp), rather than 3 events in one play and 6 
events in another play. 
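
One way to picture this event-level view is to store each row of Table 2 as a standalone record. The sketch below is only an illustration: the `Event` type and its field names are hypothetical, not part of the paper, and the rows are copied from Table 2.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class Event(NamedTuple):
    """One row of Table 2, treated as an isolated, timestamped transaction."""
    sender: str                 # initial action or player passing the ball
    receiver: str               # player receiving the ball or play outcome
    time: int                   # seconds since the start of the play
    on_court: Tuple[str, ...]   # offensive players on the court


CELTICS = ("C#5", "C#9", "C#20", "C#30", "C#34")
HEAT = ("H#3", "H#6", "H#15", "H#21", "H#31")

events = [
    # The inbounder (C#5) is treated as off the court for the inbound event.
    Event("Inbound", "C#9", 0, ("C#9", "C#20", "C#30", "C#34")),
    Event("C#9", "C#5", 11, CELTICS),
    Event("C#5", "Miss 2", 12, CELTICS),
    Event("Rebound", "H#6", 0, HEAT),
    Event("H#6", "H#3", 7, HEAT),
    Event("H#3", "H#15", 8, HEAT),
    Event("H#15", "H#3", 9, HEAT),
    Event("H#3", "H#6", 12, HEAT),
    Event("H#6", "Miss 3", 17, HEAT),
]

# Count transactions per (sender, receiver) pair, ignoring play membership.
transaction_counts = Counter((e.sender, e.receiver) for e in events)
print(len(events), transaction_counts[("H#6", "H#3")])
```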

3. Models. Our goal is to model the basketball network and cluster 
players into different groups, so that players in the same group have similar 
playing styles, while those in different groups play the game in more distinct 
ways. We propose a Continuous-time Stochastic Block Model. The main idea 
is to adopt the Stochastic Block Model framework and model basketball 
plays as Markov Chains. 

3.1. Stochastic Block Models. The Stochastic Block Model (SBM)
(Snijders and Nowicki, 1997; Holland, Laskey and Leinhardt, 1983; Wang and
Wong, 1987) is an important framework for model-based community
detection in static networks. Many recent works have generalized the model
(Airoldi et al., 2008; Karrer and Newman, 2011) and explored its theoretical
properties (Bickel and Chen, 2009; Rohe, Chatterjee and Yu, 2011; Zhao,
Levina and Zhu, 2012; Choi, Wolfe and Airoldi, 2012). The standard SBM
assumes that each node belongs to an underlying block or community. Nodes
in the same block are stochastically equivalent. The distribution of an edge
between two nodes is governed by the blocks to which they belong.
Moreover, given the block affiliations of the nodes, all edges are conditionally
independent.

Mathematically, recall that a network with $n$ vertices can be represented
by an $n \times n$ adjacency matrix, $A = [A_{ij}]$, where $A_{ij} = 0$ or $1$ respectively
indicates the absence or presence of the edge $i \to j$. The SBM specifies that,
given $K$ blocks and the block labels of all the nodes, $e = \{e_1, e_2, \ldots, e_n\}$,
where $e_i \in \{1, 2, \ldots, K\}$, the conditional distribution of these $A_{ij}$'s has the
form

\[
\mathcal{L}(A \mid e) = \prod_{1 \le i \ne j \le n} \mathcal{P}(A_{ij} \mid e_i, e_j),
\tag{1}
\]

where $\mathcal{P}(A_{ij} \mid e_i, e_j)$ is the conditional probability that there is an edge from
$i$ to $j$ given their block labels $e_i$ and $e_j$, typically modeled by a Bernoulli
distribution, i.e.,

\[
\mathcal{P}(A_{ij} \mid e_i, e_j) = P_{e_i e_j}^{A_{ij}} \, (1 - P_{e_i e_j})^{1 - A_{ij}},
\tag{2}
\]

where $\{P_{kl} : k, l = 1, 2, \ldots, K\}$ are the parameters of the model.
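
To make (1) and (2) concrete, here is a minimal sketch of the SBM log-likelihood for a given label configuration. It assumes a directed network with the product taken over ordered pairs $i \ne j$, zero-based block labels, and block probabilities strictly between 0 and 1; the function name `sbm_log_likelihood` is hypothetical.

```python
import math


def sbm_log_likelihood(adj, labels, block_probs):
    """Log of the Bernoulli SBM likelihood over ordered pairs i != j.

    adj         -- n x n 0/1 adjacency matrix (list of lists)
    labels      -- labels[i] is the zero-based block of node i
    block_probs -- block_probs[k][l] is the edge probability from block k
                   to block l, assumed to lie strictly between 0 and 1
    """
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops in the product
            p = block_probs[labels[i]][labels[j]]
            total += math.log(p) if adj[i][j] == 1 else math.log(1.0 - p)
    return total


# Two nodes in different blocks, with an edge in each direction.
adj = [[0, 1], [1, 0]]
labels = [0, 1]
block_probs = [[0.5, 0.8], [0.3, 0.5]]
print(sbm_log_likelihood(adj, labels, block_probs))
```

Working on the log scale avoids numerical underflow when the product in (1) runs over many node pairs.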

Given a network and a fixed number of blocks, $K$, the best label
configuration $e$ can be obtained by maximizing the profile likelihood function
(Bickel and Chen, 2009). However, finding the optimal solution is NP-hard.
Heuristic algorithms are available (Bickel and Chen, 2009; Karrer and
Newman, 2011; Zhao, Levina and Zhu, 2012). The model can also be fitted with
an EM algorithm (Snijders and Nowicki, 1997).

3.2. A Continuous-time Stochastic Block Model. We generalize the
standard SBM to a Continuous-time SBM for basketball networks. During a
basketball play (Figure 2), an initial action (e.g. inbounding) first
transfers the ball to a player; the ball then moves among the players; finally,
a play outcome is reached (e.g. the attacking team scores a 2-pointer).
Hence, the ball moves among three types of nodes (see Section 2): a set
of nodes $S = \{\text{inbounding}, \text{rebounding}, \text{stealing}\}$ that designate different
initial states, a total of $n$ nodes that are players themselves, and a set
of nodes $A = \{\text{Make 2}, \text{Miss 2}, \text{Make 3}, \text{Miss 3}, \text{Fouled}, \text{TO}\}$ that designate
different outcomes. In addition, we assume that there are $K$ blocks, and
each player only belongs to one block. The initial actions and the play
outcomes are observable, but the blocks to which the players belong are not.
Again, denote the block labels of the players by $e = \{e_1, e_2, \ldots, e_n\}$, where
$e_i \in \{1, 2, \ldots, K\}$. These block labels are latent. Following the conditional
independence assumption of the SBM, the transactions among the nodes are
independent given the block labels of the players. The conditional
distribution for the entire basketball network, which includes all basketball plays,
can be written as:

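\[
\mathcal{L}(T \mid e) =
\left[ \prod_{s \in S} \prod_{i=1}^{n} \mathcal{L}^{I}(T_{si} \mid e) \right]
\cdot
\left[ \prod_{1 \le i \ne j \le n} \mathcal{L}^{P}(T_{ij} \mid e) \right]
\cdot
\left[ \prod_{i=1}^{n} \prod_{a \in A} \mathcal{L}^{O}(T_{ia} \mid e) \right],
\tag{3}
\]

where $T_{si}$ denotes the transactions from an initial action $s$ to player $i$; $T_{ij}$
denotes the transactions from player $i$ to player $j$; and $T_{ia}$ denotes the
transactions from player $i$ to an outcome $a$. The conditional distribution (3)
contains three natural components: $\mathcal{L}^{I}$, the distribution of all transactions from
initial actions to players; $\mathcal{L}^{P}$, the distribution of all passes among players;
and $\mathcal{L}^{O}$, the distribution of all transactions from players to play outcomes.
In the following subsections, we specify the details of these components one
by one.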

3.2.1. Transactions from initial actions to players. Define $P = \{P_{sk} :
s \in S; k = 1, 2, \ldots, K\}$, where each $P_{sk}$ is the probability that the basketball
moves from initial action $s$ to a player in block $k$. These probabilities are
subject to the constraint that

\[
\sum_{k=1}^{K} P_{sk} = 1, \quad \text{for any } s \in S.
\tag{4}
\]

Given the block labels of all players, $e = \{e_1, e_2, \ldots, e_n\}$, the distribution of
the transactions from initial action $s$ to player $i$ is defined as

\[
\mathcal{L}^{I}(T_{si} \mid e) = \prod_{h=1}^{m_{si}} \left[ P_{s e_i} \cdot \frac{1}{G^{sih}_{e_i}} \right],
\tag{5}
\]

where $m_{si}$ is the total number of times that a play goes from initial action $s$ to
player $i$. The quantity $G^{sih}_{e_i}$ denotes the total number of “eligible receivers”
belonging to block $e_i$ for this particular play (from $s$ to $i$), where “eligible
receivers” are those players (including $i$ here) who are on $i$’s team and also
physically on the basketball court (as opposed to sitting on the bench) at
the time that a transaction takes place from initial action $s$ to player
$i$. In general, we use the notation $G^{\lambda}_{k}$ to indicate the number of “eligible
receivers” in block $k$ at the time of an event indexed by $\lambda$. Quantities of
this kind will appear a few more times in the next few sections.

The definition (5) implies that players in the same cluster are
stochastically equivalent. The probability that player $i$ receives the ball from an initial
action $s$ is governed by the block-level probability $P_{s e_i}$ and individual-level
probability $1/G^{sih}_{e_i}$, where we have assumed that all eligible receivers in the
same cluster have an equal chance to receive the ball. The individual-level
probability is needed in addition to the block-level probability because there
is only one ball at all times and only one player can receive it.
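
The two-level structure of (5), a block-level probability times an equal share among eligible receivers in that block, can be sketched as follows. This is an illustration under assumptions: the function name, the tuple layout of `receptions`, and the toy block assignment are all hypothetical.

```python
import math


def log_lik_initial(receptions, block_of, p_init):
    """Log-likelihood of transactions from initial actions to players.

    receptions -- list of (action, receiver, eligible) tuples, where
                  `eligible` lists the receiver's teammates on the court,
                  including the receiver itself
    block_of   -- maps each player to a zero-based block label
    p_init     -- p_init[action][k] is the probability that the ball goes
                  from `action` to block k (each row sums to 1)
    """
    total = 0.0
    for action, receiver, eligible in receptions:
        k = block_of[receiver]
        # Number of eligible receivers in the receiver's block.
        g = sum(1 for player in eligible if block_of[player] == k)
        total += math.log(p_init[action][k]) - math.log(g)
    return total


# Toy example based on the first row of Table 2 (block labels are made up).
block_of = {"C#9": 0, "C#20": 0, "C#30": 1, "C#34": 1}
p_init = {"Inbound": [0.6, 0.4]}
receptions = [("Inbound", "C#9", ("C#9", "C#20", "C#30", "C#34"))]
print(log_lik_initial(receptions, block_of, p_init))
```

In this toy case the ball reaches block 0 with probability 0.6 and then one of its two eligible members with probability 1/2.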

Recall that we consider three initial actions: inbounding, rebounding and
stealing. While rebounding and stealing both guarantee a new play,
inbounding can start a new play or happen in the middle of a play. For example, a
team may call a time-out in the middle of a play, and the play is resumed
from the stoppage time by inbounding the ball. Another common situation
is when an offensive player is fouled without being awarded free throws, the
play is paused and resumed by inbounding the ball. We treat all
inbounding events as initial actions and account for them in this part ($\mathcal{L}^{I}$) of the
probability distribution.

3.2.2. Transactions among players. Intuitively, in a basketball play, what
happens next mostly depends on the current situation, e.g., who has the
ball at the moment, which players are on the court, and so on. Therefore,
we model each basketball play as an inhomogeneous Markov chain. Players
are treated as regular states; initial actions are treated as initial states; and
play outcomes are modeled as absorbing states. We discussed transactions
from initial states to regular states in Section 3.2.1. In this section, we focus
on the regular states and construct $\prod_{1 \le i \ne j \le n} \mathcal{L}^{P}(T_{ij} \mid e)$, the second
component in (3), the conditional distribution of transactions among players,
given the cluster labels $e$.

Inhomogeneous Poisson process. Before doing so, we digress momentarily
to look at the distribution of an inhomogeneous Poisson process, often used in
event history analysis (Cook and Lawless, 2007). Figure 4 shows a Poisson
process with $m$ events, happening at times $t_1 < \cdots < t_m$ over the interval
$[t_0, t_m]$. Suppose that our observation of the process stops at time $t_m$. Let
$\rho(t)$ denote the rate function of this inhomogeneous Poisson process. The
distribution is of the form (Cook and Lawless, 2007, p. 30)

\[
\mathcal{L} = \prod_{i=1}^{m} \mathcal{L}\big((t_{i-1}, t_i]\big)
= \prod_{i=1}^{m} \left( \rho(t_i) \cdot \exp\left( -\int_{t_{i-1}}^{t_i} \rho(u)\,du \right) \right).
\tag{6}
\]

Fig 4. A Poisson process

The time intervals $\{(t_{i-1}, t_i], i = 1, 2, \ldots, m\}$ are independent. For each time
interval $(t_{i-1}, t_i]$, the distribution consists of two parts: the part for the actual
event, $\rho(t_i)$, and the part for the time gap between events, $\exp\left(-\int_{t_{i-1}}^{t_i} \rho(u)\,du\right)$.
The derivation of (6), especially showing why the part for the time gap has
this particular form, is given in Appendix A; it can also be found in Cook
and Lawless (2007).
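
As a numerical illustration of (6), the sketch below evaluates the log-likelihood of an inhomogeneous Poisson process for an arbitrary rate function. The integral is approximated with a composite midpoint rule, which is an assumption of this example rather than anything prescribed by the paper; the function names are hypothetical.

```python
import math


def integrate(f, a, b, steps=1000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def poisson_log_likelihood(rate, event_times, t0=0.0):
    """Log of (6): sum over events of log rho(t_i) minus the integrated
    rate over the gap (t_{i-1}, t_i]. Observation stops at the last event."""
    total = 0.0
    previous = t0
    for t in event_times:
        total += math.log(rate(t)) - integrate(rate, previous, t)
        previous = t
    return total


# Constant rate: reduces to m * log(rho) - rho * (t_m - t_0).
print(poisson_log_likelihood(lambda t: 2.0, [1.0, 2.5, 4.0]))

# Linearly increasing rate rho(t) = t + 1.
print(poisson_log_likelihood(lambda t: t + 1.0, [1.0, 2.0]))
```

With a constant rate the expression collapses to the familiar homogeneous Poisson form, which gives a quick sanity check on any implementation.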

Components of $\mathcal{L}^{P}(T_{ij} \mid e)$. We now derive $\mathcal{L}^{P}(T_{ij} \mid e)$, the conditional
distribution of transactions from player $i$ to $j$. To start, we revisit the basketball
play shown in Figure 2 and isolate the segments related to the $i \to j$ process.
For simplicity, suppose that player $j$ is on the court during the entire play.
As shown in Figure 5, player $i$ first receives the ball at time $t_1$ and passes it
to player $j$ at time $t_2$, so the time period $(t_1, t_2]$ clearly belongs to the $i \to j$
process. Next, player $i$ gains possession of the ball again at time $t_{m-1}$ and
the ball is passed to player $r \ne j$ at time $t_m$. Although player $i$ does not
make this pass to player $j$, he has the potential to do so. Hence, the time
period $(t_{m-1}, t_m)$ is also related to the $i \to j$ process. In fact, aside from the
time point $t_2$ itself, there is no difference between the segments $(t_{m-1}, t_m)$
and $(t_1, t_2)$ in terms of being part of the $i \to j$ process: as long as $i$ has
possession of the ball, the segment is related to the $i \to j$ process, regardless
of whether $i$ actually passes the ball to $j$ or not at the end of the segment. In
Figure 5, the segments related to the $i \to j$ process are highlighted by solid
points and segments. Any solid point indicates an actual pass going from $i$
to $j$. Any solid segment means that, during that time period, an $i$-to-$j$ pass
has the potential to happen.

[Figure 5: the timeline of Figure 2, with the segments during which player $i$ holds the ball drawn solid and the pass from $i$ to $j$ at time $t_2$ marked by a solid point.]

Fig 5. Segments of a play that are related to the $i \to j$ process. The $i \to j$ process consists
of the solid point and the solid segments.


Given the cluster labels $e$, we model each $i \to j$ process as pieces of a
Poisson process. In addition, since the plays are independent of one another,
we can pool together all the “solid segments” and “solid points” (again,
see Figure 5) from different plays. Instead of scalar parameters for the
standard SBM, for the continuous-time SBM we now have rate functions,
$\{\rho_{kl}(t) : k, l = 1, 2, \ldots, K\}$, where each $\rho_{kl}(t)$ is the rate that the ball moves
from a player in cluster $k$ to a player in cluster $l$ at time $t$. By equation (6),
the distribution of transactions from $i$ to $j$ is

\[
\mathcal{L}^{P}(T_{ij} \mid e) =
\underbrace{\left[ \prod_{h=1}^{m_{ij}} \left( \rho_{e_i e_j}(t_{ijh}) \cdot \frac{1}{G^{ijh}_{e_j}} \right) \right]}_{\mathcal{L}^{P1}(T_{ij} \mid e)}
\times
\underbrace{\left[ \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \rho_{e_i e_j}(t) \cdot \frac{I_{jih}}{G^{ih}_{e_j}} \, dt \right) \right]}_{\tilde{\mathcal{L}}^{P2}(T_{ij} \mid e)},
\tag{7}
\]

where

• $m_{ij}$ is the total number of passes from $i$ to $j$;

• $t_{ijh}$ is the time of the $h$-th pass from $i$ to $j$;

• $G^{ijh}_{e_j}$ is the number of “eligible receivers” belonging to block $e_j$ for
the $h$-th pass between $i$ and $j$, with “eligible receivers” being those
players (excluding $i$ here) who are on $i$’s team and also physically on
the basketball court at the time of this pass;

• $M_i$ is the total number of times that player $i$ has possession of the ball;

• $(t_{ih}^{-}, t_{ih})$ is the $h$-th time interval in which player $i$ has possession of the
ball;

• $G^{ih}_{e_j}$ is the number of “eligible receivers” belonging to block $e_j$ for the
$h$-th pass from player $i$ (regardless of whether $j$ is the recipient or not);
and

• the indicator $I_{jih}$ is defined as

\[
I_{jih} =
\begin{cases}
1, & \text{if player } j \text{ is an “eligible receiver” for the } h\text{-th pass from } i; \\
0, & \text{otherwise.}
\end{cases}
\]

Note that the quantities, $G^{ih}_{e_j}$ and $I_{jih}$, are both constant on any interval
$(t_{ih}^{-}, t_{ih}]$, since the rules of the game prevent player substitutions during any
such time interval. In addition, we have defined

\[
\mathcal{L}^{P1}(T_{ij} \mid e) = \prod_{h=1}^{m_{ij}} \left[ \rho_{e_i e_j}(t_{ijh}) \cdot \frac{1}{G^{ijh}_{e_j}} \right]
\tag{8}
\]

but written $\tilde{\mathcal{L}}^{P2}$ for the second component (rather than $\mathcal{L}^{P2}$) because it
can be simplified further (more details below) and this here is not the final
expression we shall use.
In (7), the first term contains information about all passes from $i$ to $j$,
and the second term contains the information that $i$ does not make a pass
to $j$ during all those time gaps in which $i$ has possession of the ball. The
overall rate function for the $i \to j$ process consists of two distinctive parts.
First, the rate function $\rho_{e_i e_j}(t)$ captures the rate of passing the ball at a
cluster level. Second, similar to the fraction in (5), the fractions,

\[
\frac{1}{G^{ijh}_{e_j}} \quad \text{and} \quad \frac{I_{jih}}{G^{ih}_{e_j}},
\]

are the probabilities that player $j$ is the actual receiver of the ball in group
$e_j$. As in Section 3.2.1, we have assumed that all eligible receivers in the
same cluster have an equal chance to receive the ball.

Notice that, if player $j$ is off the court for a particular pass from $i$ or
if $j$ is on the opponent team playing against $i$, then the fraction $I_{jih}/G^{ih}_{e_j}$ is
automatically 0 by the definition of $I_{jih}$. In this way, time intervals $(t_{ih}^{-}, t_{ih})$ in
which $j$ is not an “eligible receiver” do not contribute to the $i \to j$ process,
as one intuitively would expect. Furthermore, if $G^{ih}_{e_j} = 0$, it means there
is no “eligible receiver” in block $e_j$; this can only happen if player $j$ is
not eligible itself, i.e., when $I_{jih} = 0$, because otherwise $G^{ih}_{e_j}$ is at least one
since player $j$ (always) belongs to block $e_j$. We define $0/0 = 0$. Finally, all
time points, $\{t_{ijh} : i, j = 1, 2, \ldots, n; h = 1, 2, \ldots, m_{ij}\}$ and $\{t_{ih}^{-}, t_{ih} : i =
1, 2, \ldots, n; h = 1, 2, \ldots, M_i\}$, take values on the interval $[0, 24]$ (see Section
2).
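
To see how the two terms of (7) combine, the following sketch evaluates the log of (7) for one ordered pair of players in the special case of constant rates, so that each integral reduces to rate times possession length. Constant rates, the function name and the dictionary layout of `possessions` are assumptions made only for this illustration.

```python
import math


def log_lik_pair(j, block_i, labels, rho, possessions):
    """Log of (7) for passes from a fixed player i to player j.

    j           -- candidate receiver
    block_i     -- zero-based block label of the passer i
    labels      -- maps each player to a zero-based block label
    rho         -- rho[k][l] is a constant passing rate from block k to l
                   (assumption: the paper uses time-varying rates)
    possessions -- one dict per possession of player i with keys
                   'start', 'end', 'receivers' (eligible teammates,
                   excluding i) and 'passed_to' (a player or None)
    """
    block_j = labels[j]
    rate = rho[block_i][block_j]
    total = 0.0
    for pos in possessions:
        eligible = j in pos["receivers"]
        g = sum(1 for p in pos["receivers"] if labels[p] == block_j)
        if pos["passed_to"] == j:
            # First term: an actual pass from i to j.
            total += math.log(rate) - math.log(g)
        if eligible:
            # Second term: j could have received the ball during this gap.
            total -= rate * (pos["end"] - pos["start"]) / g
    return total


labels = {"j": 1, "r": 1}
rho = [[0.1, 0.2], [0.3, 0.4]]
possessions = [
    {"start": 1.0, "end": 3.0, "receivers": ("j", "r"), "passed_to": "j"},
    {"start": 10.0, "end": 12.0, "receivers": ("j", "r"), "passed_to": "r"},
]
print(log_lik_pair("j", 0, labels, rho, possessions))
```

Possessions in which $j$ is not an eligible receiver contribute nothing, matching the convention $0/0 = 0$.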

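Further simplification of $\tilde{\mathcal{L}}^{P2}$. So far, we have derived the (conditional)
distribution of transactions from player $i$ to player $j$, $\mathcal{L}^{P}(T_{ij} \mid e)$. The
conditional independence assumption means the (conditional) distribution of
transactions between all pairs of players is simply

\[
\prod_{1 \le i \ne j \le n} \mathcal{L}^{P}(T_{ij} \mid e) =
\left[ \prod_{1 \le i \ne j \le n} \mathcal{L}^{P1}(T_{ij} \mid e) \right]
\times
\left[ \prod_{1 \le i \ne j \le n} \tilde{\mathcal{L}}^{P2}(T_{ij} \mid e) \right].
\]

The second term above can be simplified further. In particular,

\begin{align}
\prod_{1 \le i \ne j \le n} \tilde{\mathcal{L}}^{P2}(T_{ij} \mid e)
&= \prod_{1 \le i \ne j \le n} \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \rho_{e_i e_j}(t) \cdot \frac{I_{jih}}{G^{ih}_{e_j}} \, dt \right) \tag{9} \\
&= \prod_{i=1}^{n} \prod_{h=1}^{M_i} \prod_{j \ne i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \rho_{e_i e_j}(t) \cdot \frac{I_{jih}}{G^{ih}_{e_j}} \, dt \right) \notag \\
&= \prod_{i=1}^{n} \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \sum_{j \ne i} \rho_{e_i e_j}(t) \cdot \frac{I_{jih}}{G^{ih}_{e_j}} \, dt \right) \notag \\
&= \prod_{i=1}^{n} \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \sum_{l=1}^{K} \sum_{j \ne i,\, e_j = l} \left[ \rho_{e_i l}(t) \cdot \frac{I_{jih}}{G^{ih}_{l}} \right] dt \right) \notag \\
&= \prod_{i=1}^{n} \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \sum_{l=1}^{K} \left( \rho_{e_i l}(t) \cdot \sum_{j \ne i,\, e_j = l} \frac{I_{jih}}{G^{ih}_{l}} \right) dt \right). \notag
\end{align}

Notice that, on the set $e_j = l$, whenever $G^{ih}_{l} = 0$ (i.e., nobody in block $l$ is
an eligible receiver), we must have $I_{jih} = 0$ as well (i.e., player $j$ cannot be
an eligible receiver, either, since $e_j = l$ means player $j$ belongs to block $l$).
Therefore,

\[
\sum_{j \ne i,\, e_j = l} \frac{I_{jih}}{G^{ih}_{l}} = I(G^{ih}_{l} > 0).
\]

Continuing with (9), this means

\[
\prod_{1 \le i \ne j \le n} \tilde{\mathcal{L}}^{P2}(T_{ij} \mid e) =
\prod_{i=1}^{n} \underbrace{\prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \sum_{l=1}^{K} \left( \rho_{e_i l}(t) \cdot I(G^{ih}_{l} > 0) \right) dt \right)}_{\mathcal{L}^{P2}(T_i \mid e)}.
\tag{10}
\]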
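
The identity used between (9) and (10), that the eligibility fractions within a block sum to the indicator $I(G^{ih}_{l} > 0)$, can be checked numerically on a toy roster. The sketch below is illustrative only; the function name, the roster and the block labels are made up.

```python
def eligible_fraction_sum(passer, block, labels, on_court):
    """Sum of I_jih / G over players j != passer in `block`, with 0/0 := 0.

    labels   -- maps every player on the passer's team to a block label
    on_court -- teammates currently on the court (may include the passer)
    Returns (sum of fractions, G), where G counts eligible receivers
    in `block`.
    """
    receivers = [p for p in on_court if p != passer]
    g = sum(1 for p in receivers if labels[p] == block)
    total = 0.0
    for j, block_j in labels.items():
        if j == passer or block_j != block:
            continue
        indicator = 1 if j in receivers else 0
        # Convention from the text: 0/0 is defined as 0.
        total += indicator / g if g > 0 else 0.0
    return total, g


labels = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
on_court = ["a", "b", "c"]  # players d and e are on the bench

for block in (0, 1, 2):
    total, g = eligible_fraction_sum("a", block, labels, on_court)
    print(block, total, g > 0)
```

In each block the sum equals 1 when at least one eligible receiver is on the court and 0 otherwise, which is exactly the indicator in (10).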


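Decomposition of $\mathcal{L}^{P}(T_{ij} \mid e)$. Putting all the pieces together, the
conditional distribution of all transactions among players, given the block labels,
is of the form

\[
\prod_{1 \le i \ne j \le n} \mathcal{L}^{P}(T_{ij} \mid e) =
\left[ \prod_{1 \le i \ne j \le n} \mathcal{L}^{P1}(T_{ij} \mid e) \right]
\times
\left[ \prod_{i=1}^{n} \mathcal{L}^{P2}(T_i \mid e) \right].
\tag{11}
\]

The first component, $\prod_{1 \le i \ne j \le n} \mathcal{L}^{P1}(T_{ij} \mid e)$, contains information about all passes
from $i$ to $j$. The second component, $\prod_{i=1}^{n} \mathcal{L}^{P2}(T_i \mid e)$, contains information
about all the time gaps in which player $i$ has possession of the ball,
although, admittedly, denoting all these time gaps here by $T_i$ is a slight
abuse of notation.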
In equation (10), the indicator $I(G^{ih}_{l} > 0)$ is important for two reasons.
First, if node $i$ is the only member in group $l$ or if group $l$ is empty, then it
is impossible for $i$ to pass the ball to group $l$, so intuitively the rate function
$\rho_{e_i l}(t)$ should not contribute any information to this part of the probability
distribution. Indeed, in either situation, we have $G^{ih}_{l} = 0$, and this indicator
effectively “annihilates” the contribution of $\rho_{e_i l}$. Second, we can see from (10)
that, overall, player $i$ has a rate of $\sum_{l=1}^{K} \left( \rho_{e_i l}(t) \cdot I(G^{ih}_{l} > 0) \right)$ to pass the ball
at time $t$. Given $\rho_{kl}(t)$, when there are fewer groups for player $i$ to pass the
ball to, its overall rate of passing the ball is automatically reduced by this
indicator, which agrees with our intuition about how basketball games are
played.

3.2.3. Transactions from players to play outcomes. The play outcomes
are modeled as absorbing states of the Markov chain. Given a set $A$ of
different play outcomes, we define additional rate functions $\{\eta_{ka}(t) : k =
1, 2, \ldots, K; a \in A\}$, where $\eta_{ka}(t)$ is the rate that a play goes from group $k$
to absorbing state $a$ at time $t$.

Whenever player $i$ has possession of the ball, there exists a possibility
that the ball is “passed” to an absorbing state, $a$. Analogous to (7), the
distribution of transactions from player $i$ to an absorbing state $a$ can be
written as

\[
\mathcal{L}^{O}(T_{ia} \mid e) =
\left[ \prod_{h=1}^{m_{ia}} \eta_{e_i a}(t_{iah}) \right]
\times
\left[ \prod_{h=1}^{M_i} \exp\left( -\int_{t_{ih}^{-}}^{t_{ih}} \eta_{e_i a}(t) \, dt \right) \right],
\tag{12}
\]

where $m_{ia}$ is the total number of times that the ball goes from node $i$ to
absorbing state $a$, and $t_{iah}$ is the time of the $h$-th event from $i$ to $a$. Unlike
in (7), we no longer need to multiply the rate function $\eta_{e_i a}(\cdot)$ by an additional
individual-level probability (such as $1/G^{ijh}_{e_j}$), since there aren’t multiple
options within an absorbing state as there can be multiple players in a cluster.
As in (7), the first term contains information about the event times, and
the second term contains the information that player $i$ does not “cause” the
play to end in absorbing state $a$ while in possession of the ball.

Even though being fouled does not always end a play, we still consider
being fouled as an “outcome” and take account of all fouls in this part ($\mathcal{L}^{O}$)
of the probability distribution.

3.2.4. A Markov chain. Here is a brief recapitulation of how we have
modeled basketball networks (Sections 3.2.1, 3.2.2 and 3.2.3) conditional
on the cluster labels of the players. There are three types of nodes in the
network: special nodes that designate initial actions, regular nodes that are
players themselves, and terminal nodes that designate play outcomes. If we
isolate any two regular nodes, or a regular node and a terminal node,
transactions between those two nodes have been modelled as an inhomogeneous
Poisson process. Each basketball play, however, will consist of a sequence
of transactions, typically starting from a special node, travelling across
multiple regular nodes, and ending in a terminal node. Each play is thus
an inhomogeneous, continuous-time Markov chain, of which the players are
regular states and outcomes are absorbing states.
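
The Markov-chain view can be illustrated with a small simulation. The sketch below is a deliberate simplification and not the paper's model as fitted: it works at the block level rather than the player level, uses constant rates instead of time-varying ones, and treats an expired 24-second clock as a turnover. All names and rate values are hypothetical.

```python
import random


def simulate_play(start_block, pass_rate, outcome_rate, rng, max_time=24.0):
    """Simulate one play as a continuous-time Markov chain over blocks.

    pass_rate[k][l]    -- constant rate of passing from block k to block l
    outcome_rate[k][a] -- constant rate of ending in outcome a from block k
    Returns a list of (time, state) pairs; the last state is an outcome.
    """
    t, block = 0.0, start_block
    path = [(t, block)]
    while True:
        options = [(("pass", l), r) for l, r in enumerate(pass_rate[block])]
        options += [(("end", a), r) for a, r in outcome_rate[block].items()]
        total = sum(r for _, r in options)
        # Waiting time until the next event under competing exponentials.
        t += rng.expovariate(total)
        if t >= max_time:
            path.append((max_time, "TO"))  # assumption: clock expiry = turnover
            return path
        # Choose which event happened, proportionally to its rate.
        u = rng.random() * total
        acc = 0.0
        for label, r in options:
            acc += r
            if u < acc:
                break
        kind, target = label
        path.append((t, target))
        if kind == "end":
            return path
        block = target


pass_rate = [[0.10, 0.30], [0.20, 0.10]]
outcome_rate = [
    {"Make 2": 0.05, "Miss 2": 0.05, "TO": 0.02},
    {"Make 3": 0.04, "Miss 3": 0.06, "TO": 0.02},
]
print(simulate_play(0, pass_rate, outcome_rate, random.Random(0)))
```

Each simulated path starts in a block, moves between blocks, and is absorbed in a play outcome, mirroring the special, regular and terminal nodes described above.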

3.2.5. Nonparametric modeling of rate functions. We model the rate
functions nonparametrically by cubic B-splines:

(13)  $\rho_{kl}(t) = \sum_{p=1}^{P} e^{\beta_{klp}} B_p(t)$,  for $k, l = 1, 2, \ldots, K$,

(14)  $\eta_{ka}(t) = \sum_{p=1}^{P} e^{\psi_{kap}} B_p(t)$,  for $k = 1, 2, \ldots, K$ and $a \in A$,

where $\{B_1(t), B_2(t), \ldots, B_P(t)\}$ are basis functions; and
$\beta = \{\beta_{klp} : k, l = 1, 2, \ldots, K;\ p = 1, 2, \ldots, P\}$,
$\psi = \{\psi_{kap} : k = 1, 2, \ldots, K;\ a \in A;\ p = 1, 2, \ldots, P\}$
are coefficients. We use exponentiated coefficients, $e^{\beta_{klp}}$ and
$e^{\psi_{kap}}$, to ensure that all rate functions are nonnegative.
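To make the construction in (13) and (14) concrete, the following is a minimal Python sketch, not the authors' code. It evaluates one rate function as a sum of cubic B-spline basis functions weighted by exponentiated coefficients. The knot vector, the number of basis functions and the coefficient values are illustrative assumptions; the paper does not state them in this section.

```python
import math

def bspline_basis(p, degree, knots, t):
    """Cox-de Boor recursion: value of the p-th B-spline basis function at t."""
    if degree == 0:
        return 1.0 if knots[p] <= t < knots[p + 1] else 0.0
    left = right = 0.0
    d1 = knots[p + degree] - knots[p]
    if d1 > 0:
        left = (t - knots[p]) / d1 * bspline_basis(p, degree - 1, knots, t)
    d2 = knots[p + degree + 1] - knots[p + 1]
    if d2 > 0:
        right = ((knots[p + degree + 1] - t) / d2
                 * bspline_basis(p + 1, degree - 1, knots, t))
    return left + right

def rate(t, log_coefs, knots, degree=3):
    """rho(t) = sum_p exp(beta_p) * B_p(t); nonnegative for any real beta_p."""
    return sum(math.exp(b) * bspline_basis(p, degree, knots, t)
               for p, b in enumerate(log_coefs))

# Hypothetical clamped knot vector on a 24-second play clock (an assumption).
KNOTS = [0, 0, 0, 0, 6, 12, 18, 24, 24, 24, 24]
N_BASIS = len(KNOTS) - 3 - 1  # number of cubic basis functions, here P = 7

# Illustrative coefficients only; fitted values would come from the M-step.
beta = [-3.0, 2.0, -1.0, 0.5, 0.0, -2.0, 1.0]
print(rate(10.0, beta, KNOTS))
```

Because each basis function is nonnegative and each weight is an exponential, the rate cannot become negative however the optimizer moves the coefficients, which is the point of the parameterization.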

3.3. Related Models. Vu et al. (2011) also adopted event history models
to deal with transactional networks, but they did not consider block
structures. There are also a number of studies about transactional networks
in the framework of SBMs. For example, Shafiei and Chipman (2010) focused
on the number of transactions, but did not consider the time factor.
Ho, Song and Xing (2011), and Xu and Hero (2014) studied networks at
discrete time points and used State Space Models to describe intertemporal
dynamics. DuBois, Butts and Smyth (2013) had some ideas similar to
ours; they focused on generic transactional networks and parameterized the
rate/intensity function using a linear model of various network statistics,
but their model could not be applied directly to basketball networks.

4. An EM+ Algorithm. Since the cluster labels, $e = (e_1, e_2, \ldots, e_n)$,
are unknown, we introduce latent variables and adopt the
Expectation-Maximization (EM) algorithm to fit the Continuous-time SBM. Due to the
complexity of the model, we have found in our experience that the EM
algorithm alone can sometimes be trapped in various local optima. Running
the EM algorithm with many random starting points helps, but it is quite
inefficient. Instead, we have added a complementary heuristic algorithm to
run after the EM algorithm. We refer to the complementary algorithm as
the “Plus algorithm” and call our overall algorithm an “EM+ algorithm”.
Empirically, we have found that the EM+ algorithm often reaches a good
optimum with fewer starting points than does the EM algorithm itself.


4.1. EM Algorithm. Let $z_i = (z_{i1}, z_{i2}, \ldots, z_{iK})$ denote a latent
label indicator for node $i$, such that

(15)  $z_{ik} = 1$ if node $i$ belongs to cluster $k$, and $z_{ik} = 0$ otherwise.

Marginally,

$z_1, z_2, \ldots, z_n \overset{\text{iid}}{\sim} \text{multinomial}(1, \pi)$,

where $\pi = (\pi_1, \ldots, \pi_K)$.


We shall use $\Theta = \{\pi, P, \beta, \psi\}$ to denote all parameters, and
$Z = \{z_i : i = 1, 2, \ldots, n\}$ to denote all latent indicators. The complete
likelihood of the Continuous-time SBM is simply the joint distribution of
$(T, Z)$ viewed as a function of $\Theta$. To simplify our notation as well as to
make more direct references to the models we described in Section 3, in this
section we will often suppress $\Theta$ and still write $\mathcal{L}(T, Z)$ instead
of $\mathcal{L}(\Theta; T, Z)$ for the likelihood function. Hence, the complete
likelihood is

(16)  $\mathcal{L}(T, Z) = \mathcal{L}(T \mid Z) \cdot \mathcal{L}(Z)$.

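The conditional likelihood $\mathcal{L}(T \mid Z)$ is simply a latent-variable-coded
version of $\mathcal{L}(T \mid e)$ (3), that is,

(17)  $\mathcal{L}(T \mid Z) = \left[\prod_{s \in S} \prod_{i=1}^{n} \mathcal{L}^{S}(T_{si} \mid Z)\right] \cdot \left[\prod_{i \neq j} \mathcal{L}^{P}(T_{ij} \mid Z)\right] \cdot \left[\prod_{i=1}^{n} \prod_{a \in A} \mathcal{L}^{A}(T_{ia} \mid Z)\right]$

      $= \left[\prod_{s \in S} \prod_{i=1}^{n} \mathcal{L}^{S}(T_{si} \mid Z)\right] \cdot \left[\prod_{i \neq j} \mathcal{L}^{P_1}(T_{ij} \mid Z) \cdot \prod_{i=1}^{n} \mathcal{L}^{P_2}(T_{i} \mid Z)\right] \cdot \left[\prod_{i=1}^{n} \prod_{a \in A} \mathcal{L}^{A}(T_{ia} \mid Z)\right]$,

where the second step above is due to (11). More specifically, the components
of (17) are simply latent-variable versions of (5), (8), (10) and (12):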

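(18)  $\mathcal{L}^{S}(T_{si} \mid Z) = \prod_{k=1}^{K} \left[\prod_{h=1}^{m_{si}} P_{sk} \cdot \frac{1}{G^{k}_{sih}}\right]^{z_{ik}}$,

(19)  $\mathcal{L}^{P_1}(T_{ij} \mid Z) = \prod_{k=1}^{K} \prod_{l=1}^{K} \left[\prod_{h=1}^{m_{ij}} \rho_{kl}(t_{ijh}) \cdot \frac{1}{G^{l}_{ijh}}\right]^{z_{ik} z_{jl}}$,

(20)  $\mathcal{L}^{P_2}(T_{i} \mid Z) = \prod_{k=1}^{K} \left[\prod_{h=1}^{M_i} \exp\left(-\sum_{l=1}^{K} \int_{t^{-}_{ih}}^{t_{ih}} \rho_{kl}(t) \cdot I(G_i^{l} > 0)\,dt\right)\right]^{z_{ik}}$,

(21)  $\mathcal{L}^{A}(T_{ia} \mid Z) = \prod_{k=1}^{K} \left[\prod_{h=1}^{m_{ia}} \eta_{ka}(t_{iah}) \cdot \prod_{h=1}^{M_i} \exp\left(-\int_{t^{-}_{ih}}^{t_{ih}} \eta_{ka}(t)\,dt\right)\right]^{z_{ik}}$.

The marginal likelihood of $Z$ is

(22)  $\mathcal{L}(Z) = \prod_{i=1}^{n} \prod_{k=1}^{K} \pi_k^{z_{ik}}$.
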

4.1.1. E-step. In the E-step, we compute $E(\log \mathcal{L}(T, Z) \mid T; \Theta^*)$,
the conditional expectation of the log-likelihood given the observed network $T$
under the current parameter estimates (denoted by $\Theta^*$). The conditional
expectation is with respect to the latent variables $Z$. From (18)-(21) it is
clear (details in Appendix B.1) that there are three types of conditional
expectations to evaluate:

• $E(z_{ik} \mid T; \Theta^*)$, from $\log \mathcal{L}^{S}(T_{si} \mid Z)$, $\log \mathcal{L}^{A}(T_{ia} \mid Z)$ and $\log \mathcal{L}(Z)$, respectively;

• $E(z_{ik} z_{jl} \mid T; \Theta^*)$, from $\log \mathcal{L}^{P_1}(T_{ij} \mid Z)$; and

• $E(z_{ik} \cdot I(G_i^{l} > 0) \mid T; \Theta^*)$, from $\log \mathcal{L}^{P_2}(T_{i} \mid Z)$.

After taking logarithms, the terms involving $1/G^{k}_{sih}$ and $1/G^{l}_{ijh}$
in (18) and (19) are additive “constants” that depend only on the latent
variables $Z$ but contain no information about the parameters $\Theta$; they can
be omitted for the EM algorithm. The quantity

et = E ■ If) 

and hence the indicator $I(G_i^{l} > 0)$ are both functions of the latent
variables. Here, we see more clearly why the further simplification of
$\mathcal{L}^{P_2}$, equation (10), is useful. Due to the interactions of the
players, the latent variables are conditionally dependent and an exact
calculation of the conditional expectations above is NP-hard. For instance, in
order to calculate $E(z_{ik} \mid T; \Theta^*)$, one needs to marginalize the
cluster labels over all nodes that interact with $i$. We use a Gibbs sampler to
draw samples from $\mathcal{L}(Z \mid T; \Theta^*)$, and use the corresponding
sample means to approximate $E(z_{ik} \mid T; \Theta^*)$,
$E(z_{ik} z_{jl} \mid T; \Theta^*)$ and $E(z_{ik} \cdot I(G_i^{l} > 0) \mid T; \Theta^*)$.

Gibbs sampler. Let $Z^{-i} = \{z_j : j \neq i\}$ denote the latent cluster
indicators of all players other than $i$. The idea of the Gibbs sampler is to draw

$z_1 \sim \mathcal{L}(z_1 \mid Z^{-1}, T; \Theta^*)$,

$z_2 \sim \mathcal{L}(z_2 \mid Z^{-2}, T; \Theta^*)$,

$\vdots$

$z_n \sim \mathcal{L}(z_n \mid Z^{-n}, T; \Theta^*)$,

$z_1 \sim \mathcal{L}(z_1 \mid Z^{-1}, T; \Theta^*)$,

$z_2 \sim \mathcal{L}(z_2 \mid Z^{-2}, T; \Theta^*)$,

$\vdots$

repeatedly until the stationary distribution is reached. (In our application,
a handful of repetitions are often sufficient.) Under the current parameter
estimate $\Theta^*$, the conditional distribution of $z_i$ given $Z^{-i}$ and $T$ is


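(23)  $\mathcal{L}(z_i \mid Z^{-i}, T; \Theta^*) = \dfrac{\mathcal{L}(T, Z; \Theta^*)}{\sum_{z_i} \mathcal{L}(T, Z; \Theta^*)}$,

a multinomial distribution which is easy to sample from. More explicitly,
suppose that, at the current step, $z_{j c_j} = 1$ for $j \neq i$; this means
$e_j = c_j$ for all $j \neq i$, or that $c_j$ is the current group label for
player $j$. Then, the conditional probability of player $i$ belonging to
cluster $k$ is

(24)  $P(z_{ik} = 1 \mid Z^{-i}, T; \Theta^*) = P(e_i = k \mid \{e_j = c_j : j \neq i\}, T; \Theta^*)$

      $= \dfrac{\mathcal{L}(T, e = (c_1, c_2, \ldots, c_{i-1}, k, c_{i+1}, \ldots, c_n); \Theta^*)}{\sum_{l=1}^{K} \mathcal{L}(T, e = (c_1, c_2, \ldots, c_{i-1}, l, c_{i+1}, \ldots, c_n); \Theta^*)}$.
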
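The update in (24) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_log_lik` is a hypothetical stand-in for the log of the complete likelihood at the current parameter estimate, and `TARGET` is made-up data. The likelihoods are normalised on the log scale, a common numerical precaution that the paper does not discuss.

```python
import math
import random

TARGET = [0, 0, 1, 1]  # hypothetical labels used only by the toy likelihood

def toy_log_lik(labels):
    """Stand-in for log L(T, e; Theta*): penalises disagreement with TARGET."""
    return -50.0 * sum(a != b for a, b in zip(labels, TARGET))

def conditional_probs(i, labels, K, log_lik):
    """Equation (24): P(e_i = k | all other labels), for k = 0, ..., K-1."""
    scores = []
    for k in range(K):
        trial = list(labels)
        trial[i] = k
        scores.append(log_lik(trial))
    top = max(scores)  # subtract the maximum before exponentiating
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def gibbs_sweep(labels, K, log_lik, rng):
    """One pass over z_1, ..., z_n; each label is redrawn given the others."""
    labels = list(labels)
    for i in range(len(labels)):
        probs = conditional_probs(i, labels, K, log_lik)
        labels[i] = rng.choices(range(K), weights=probs)[0]
    return labels

rng = random.Random(0)
print(gibbs_sweep([1, 0, 0, 1], K=2, log_lik=toy_log_lik, rng=rng))
```

Averaging the indicator of each label over several sweeps would give the sample means used to approximate the conditional expectations of the E-step.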


4.1.2. M-step. In the M-step, we update the parameters $\Theta$ by maximizing
$E(\log \mathcal{L}(T, Z) \mid T; \Theta^*)$. We have closed-form solutions for
$\pi$, the marginal probabilities of $Z$, and for $P$, the transition
probabilities from initial states:

(25)  $\pi_k = \dfrac{\sum_{i=1}^{n} E(z_{ik} \mid T; \Theta^*)}{\sum_{i=1}^{n} \sum_{l=1}^{K} E(z_{il} \mid T; \Theta^*)} = \dfrac{\sum_{i=1}^{n} E(z_{ik} \mid T; \Theta^*)}{n}$,

(26)  $P_{sk} = \dfrac{\sum_{i=1}^{n} \left[m_{si}\, E(z_{ik} \mid T; \Theta^*)\right]}{\sum_{i=1}^{n} \sum_{l=1}^{K} \left[m_{si}\, E(z_{il} \mid T; \Theta^*)\right]}$,

for $k = 1, 2, \ldots, K$ and $s \in S$; detailed derivations are given in
Appendix B.2. However, there are no closed-form solutions for $\beta$ and
$\psi$, the (log-)coefficients for the rate functions. We use the quasi-Newton
method with L-BFGS-B updates; more specifically, we use the optim function in
R and supply it with the analytic form of the gradient.
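The two closed-form updates (25) and (26) amount to weighted averages of the E-step output. A minimal Python sketch follows, with assumed names: `resp[i][k]` holds the approximated $E(z_{ik} \mid T; \Theta^*)$ and `m_s[i]` holds $m_{si}$, taken here to be the number of times initial action $s$ puts the ball in the hands of player $i$. The numbers are made up for illustration.

```python
def update_pi(resp):
    """Equation (25): pi_k = sum_i E(z_ik) / n."""
    n, K = len(resp), len(resp[0])
    return [sum(resp[i][k] for i in range(n)) / n for k in range(K)]

def update_initial_probs(resp, m_s):
    """Equation (26) for a single initial action s."""
    K = len(resp[0])
    weighted = [sum(m * r[k] for m, r in zip(m_s, resp)) for k in range(K)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical E-step output for four players and K = 2 clusters.
resp = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0]]
# Hypothetical counts m_si for one initial action (for example, inbound).
m_s = [4, 1, 2, 1]

print(update_pi(resp))                  # marginal cluster probabilities
print(update_initial_probs(resp, m_s))  # P_sk for k = 1, 2
```

When the Gibbs sampler concentrates on a single labelling, the entries of `resp` are close to 0 or 1 and both updates reduce to simple proportions within each cluster.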
4.1.3. Remarks. Here, we make a few important remarks about the EM
algorithm. The conditional probabilities driving the Gibbs sampler turn out
to be fairly close to 0 or 1, that is, in equation (24), one of the $K$ terms
being summed in the denominator is significantly larger than the others.
The reason is that each player is involved in many transactions. As far
as the likelihood function is concerned, these transactions act as if they
were repeated measurements, which reinforce the assignment of the player
to a particular group. The Gibbs sampler thus converges very quickly to a
singular probability mass. This essentially reduces the EM algorithm to a
$K$-means algorithm: the E-step re-assigns the players to different groups,
and the M-step re-estimates the parameters. Overall, the EM algorithm
converges in just a few iterations. But the EM algorithm can sometimes be
trapped in a local optimum. The typical way to avoid these traps is to use
different starting points, run the EM algorithm several times, and pick
the run giving the largest likelihood value. This “standard” procedure alone
could be quite inefficient. Instead, we introduce another heuristic algorithm,
which we refer to as the Plus algorithm (Section 4.2), as a complement to
the EM algorithm. Sometimes, e.g., when the EM solution is already quite
good, the Plus algorithm may not find any further improvement.

4.2. The Plus Algorithm. This algorithm is inspired by the heuristic
algorithm used by Karrer and Newman (2011) for the so-called
degree-corrected SBM. The main idea is to evaluate all neighbors of the current
labelling configuration and move to the best neighbor whether or not the
likelihood improves. A neighbor of a labelling configuration
$\mathbf{e} = (e_1, e_2, \ldots, e_n)$ is defined as one that differs from it in
exactly one entry. Thus, if $\mathbf{e}'$ and $\mathbf{e}$ are neighbors, then
there exists some $1 \le i \le n$ such that $e_i \neq e'_i$, but otherwise
$e_j = e'_j$ for all $j \neq i$. Given $n$ nodes and $K$ clusters, one labelling
configuration has $n(K - 1)$ neighbors. The steps of the algorithm are as
follows.

1. Start with $r = 0$.

2. Repeat the following steps until convergence, or for a fixed number of
steps.

(a) Given a labelling configuration $\mathbf{e}^{(r)}$ and parameter
$\Theta^{(r)}$ estimated under $\mathbf{e}^{(r)}$, calculate the likelihood of
all neighboring configurations, using the same parameter estimate,
$\Theta^{(r)}$.

(b) Let $\mathbf{e}^{(r+1)}$ be the neighbor that gives the largest likelihood.

(c) Re-estimate the parameters using $\mathbf{e}^{(r+1)}$ and denote the result
by $\Theta^{(r+1)}$.

3. Choose the best configuration among $\mathbf{e}^{(0)}, \mathbf{e}^{(1)}, \ldots$.

We use the result from the EM algorithm as the starting point to run
the Plus algorithm. The Plus algorithm converges when there exists a set of
configurations $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_q$ such that
$\mathbf{e}_1$ is the best neighbor of $\mathbf{e}_2$, $\mathbf{e}_2$ is the
best neighbor of $\mathbf{e}_3$, ..., and $\mathbf{e}_q$ is the best neighbor of
$\mathbf{e}_1$. Often, this happens for $q = 2$, but sometimes it can happen for
$q > 2$. Note that, while $\mathbf{e}^{(r+1)}$ in step (2b) gives the largest
likelihood among all neighbors of $\mathbf{e}^{(r)}$, it may still give a
smaller likelihood than does $\mathbf{e}^{(r)}$ itself, but the Plus algorithm
“accepts” $\mathbf{e}^{(r+1)}$ nonetheless. This is the main reason why the
Plus algorithm can help the EM algorithm avoid local optima. On the other hand,
the Plus algorithm itself moves very slowly: in any given iteration, only one
node label is changed, so it is quite inefficient to use it as a standalone
algorithm, but we have found it to work well as a complement to the EM
algorithm.
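The steps of the Plus algorithm can be sketched as follows. This is a toy illustration, not the authors' implementation: `fit_means` and `toy_log_lik` are hypothetical stand-ins (cluster means of made-up one-dimensional data) for re-estimating $\Theta$ and evaluating the CSBM likelihood, and the loop simply stops when a configuration is revisited, which is one way to detect the cycle described above.

```python
DATA = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # hypothetical 1-D stand-in for players

def fit_means(labels, K=2):
    """Toy re-estimation step: cluster means (0.0 for an empty cluster)."""
    means = []
    for k in range(K):
        pts = [x for x, lab in zip(DATA, labels) if lab == k]
        means.append(sum(pts) / len(pts) if pts else 0.0)
    return means

def toy_log_lik(labels, means):
    """Toy Gaussian-style log-likelihood, up to additive constants."""
    return -sum((x - means[lab]) ** 2 for x, lab in zip(DATA, labels))

def plus_algorithm(labels, K, fit_params, log_lik, max_steps=50):
    """Move to the best neighbour at every step, even if it is worse."""
    current = tuple(labels)
    params = fit_params(current)
    history = [(log_lik(current, params), current)]
    seen = {current}
    for _ in range(max_steps):
        best = None
        for i in range(len(current)):          # all n(K-1) neighbours
            for k in range(K):
                if k == current[i]:
                    continue
                cand = current[:i] + (k,) + current[i + 1:]
                score = log_lik(cand, params)  # step (2a): same parameters
                if best is None or score > best[0]:
                    best = (score, cand)
        current = best[1]                      # step (2b): accept regardless
        params = fit_params(current)           # step (2c): re-estimate
        history.append((log_lik(current, params), current))
        if current in seen:                    # configuration revisited: a cycle
            break
        seen.add(current)
    return max(history)[1]                     # step 3: best configuration seen

# Start from a poor labelling, as if EM had stopped in a local optimum.
print(plus_algorithm((0, 0, 0, 0, 1, 1), 2, fit_means, toy_log_lik))
```

In the toy run the search first repairs the mislabelled point, then is forced to step to a worse neighbour, and finally returns the best labelling it has visited.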

5. Application to NBA data. In this section, we apply our Continuous-time
Stochastic Block Model (CSBM) to a few NBA basketball games that
we have annotated ourselves. The games are: the 2012 NBA Eastern Conference
finals between the Miami Heat and the Boston Celtics, games 1 and 5;
and the 2015 NBA finals between the Cleveland Cavaliers and the Golden
State Warriors, games 2 and 5. For each game, we only consider the first
three quarters to avoid having to deal with garbage time or irregular playing
strategies (such as committing fouls on purpose), which are both common
in the last quarter. In Section 5.1, we present some further model
simplifications and corresponding adjustments to the EM+ algorithm. In Sections
5.2 and 5.3, we present results for the 2012 games between the Heat and the
Celtics, and those for the 2015 games between the Cavaliers and the
Warriors, respectively. In Section 5.4, we compare the 2012 Miami Heat with the
2015 Cleveland Cavaliers, while paying special attention to the performance
of LeBron James as he played with these two different teams in those two
series.

5.1. Model simplifications and adjustments of the EM+ algorithm. In
practice, the general model is complex, with $K(K + |A|)$ rate functions to
estimate. For applications to NBA data, we further simplify the general form
by defining

(27)  $\rho_{kl}(t) = \lambda_k(t) \cdot P_{kl}$,

(28)  $\eta_{ka}(t) = \lambda_k(t) \cdot P_{ka}$,

such that $\lambda_k(t)$ is the rate function of the ball leaving a player in
group $k$; $P_{kl}$ and $P_{ka}$ are transition probabilities that the ball goes
to group $l$ and absorbing state $a$, respectively. The transition
probabilities are subject to the constraint

(29)  $\sum_{l=1}^{K} P_{kl} + \sum_{a \in A} P_{ka} = 1$, for any $k = 1, 2, \ldots, K$.

By making such simplifications, we assume that, whenever the ball leaves
cluster $k$, the rates to other clusters and absorbing states are formed by a
common rate and proportionality constants. In reality, the transition
probabilities may change over time, but we believe that the simplified model
still contains sufficient information to cluster players and reveal important
patterns. The results in the following sections support this belief.
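One way to read (27)-(29) is as a two-stage mechanism: a single clock with rate $\lambda_k(t)$ decides when the ball leaves a player in cluster $k$, and a draw from the transition probabilities then decides where it goes. The Python sketch below simulates this with the standard thinning method for inhomogeneous Poisson processes. It is an illustration under assumptions: `toy_rate` is a made-up rate function rather than a fitted one, the bound `lam_max` must dominate it on the whole interval, and the sketch ignores the on-court indicator $I(G_i^{l} > 0)$. The probabilities are the Heat cluster 1 row of Table 5.

```python
import random

# Heat cluster 1 row of Table 5: where the ball goes when it leaves cluster 1.
HEAT_C1 = {"C1": 0.0, "C2": 0.564, "C3": 0.296, "Make2": 0.035,
           "Miss2": 0.014, "Make3": 0.0, "Miss3": 0.042,
           "Fouled": 0.021, "TO": 0.028}

def toy_rate(t):
    """Hypothetical lambda_k(t); an assumption, not a fitted curve."""
    return 0.2 + 0.05 * t

def sample_release(t0, lam, lam_max, probs, rng, horizon=24.0):
    """Draw (time, destination) for a player who received the ball at t0."""
    t = t0
    while True:
        t += rng.expovariate(lam_max)        # candidate from a constant-rate clock
        if t >= horizon:
            return horizon, None             # play clock runs out first
        if rng.random() < lam(t) / lam_max:  # thinning: accept w.p. lam(t)/lam_max
            dests = list(probs)
            weights = [probs[d] for d in dests]
            return t, rng.choices(dests, weights=weights)[0]

rng = random.Random(1)
print(sample_release(3.0, toy_rate, 1.4, HEAT_C1, rng))
```

Chaining such draws, with each receiving cluster's own rate and row of probabilities, until an outcome is reached would simulate a whole play as the Markov chain of Section 3.2.4.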

The rate function simplifications lead to modifications in the EM+
algorithm. Recall that for the general model, we update the marginal and initial
probabilities by (25) and (26), respectively, in the M-step. Meanwhile, we
update the rate functions by the quasi-Newton method. Under the simplified
model, (25) and (26) still apply because the marginal and initial
probabilities remain unchanged. Nevertheless, the $K(K + |A|)$ rate functions
reduce to $K$ rate functions and a $K \times (K + |A|)$ transition matrix. We
still adopt quasi-Newton for the rate functions, yet we have closed-form
solutions for the transition probabilities (details in Appendix B.3),

(30)  $P_{kl} = \dfrac{\sum_{i=1}^{n} \sum_{j=1}^{n} \left(E[z_{ik} z_{jl} \mid T; \Theta^*] \cdot m_{ij}\right)}{\sum_{i=1}^{n} \sum_{h=1}^{M_i} \left(E[z_{ik}\, I(G_i^{l}(t_{ih}) > 0) \mid T; \Theta^*] \cdot \int_{t^{-}_{ih}}^{t_{ih}} \lambda_k(t)\,dt\right) + \zeta_k}$,

(31)  $P_{ka} = \dfrac{\sum_{i=1}^{n} \left(E[z_{ik} \mid T; \Theta^*] \cdot m_{ia}\right)}{\sum_{i=1}^{n} \sum_{h=1}^{M_i} \left(E[z_{ik} \mid T; \Theta^*] \cdot \int_{t^{-}_{ih}}^{t_{ih}} \lambda_k(t)\,dt\right) + \zeta_k}$,

for $k, l = 1, 2, \ldots, K$ and $a \in A$. The parameter $\zeta_k$ is the
Lagrange multiplier, which can be easily solved by finding the root of
$\sum_{l=1}^{K} P_{kl} + \sum_{a \in A} P_{ka} = 1$ with the R function uniroot.
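The root-finding step for the Lagrange multiplier can be sketched in Python with plain bisection, standing in for R's uniroot. Everything here is illustrative: `numer` and `denom` are hypothetical numerators and integral terms of the form appearing in (30) and (31) for one cluster $k$, so each probability is `numer[j] / (denom[j] + zeta)`. The sketch relies on the sum being strictly decreasing in `zeta` to the right of its largest pole, so the root there is unique and keeps every probability positive.

```python
def solve_multiplier(numer, denom):
    """Find zeta such that sum_j numer[j] / (denom[j] + zeta) = 1."""
    pairs = [(n, d) for n, d in zip(numer, denom) if n > 0]

    def excess(zeta):
        return sum(n / (d + zeta) for n, d in pairs) - 1.0

    lo = -min(d for _, d in pairs)  # excess tends to +infinity just right of lo
    step = 1.0
    while excess(lo + step) > 0:    # grow the bracket until the sign flips
        step *= 2.0
    hi = lo + step
    for _ in range(200):            # bisection on the bracketed root
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # interval can no longer be split
            break
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical numerators (expected counts) and denominators (integral terms).
numer = [56.4, 29.6, 14.0]
denom = [80.0, 90.0, 100.0]
zeta = solve_multiplier(numer, denom)
probs = [n / (d + zeta) for n, d in zip(numer, denom)]
print(zeta, probs, sum(probs))
```

Destinations with a zero numerator receive probability zero and are left out of the search, which also avoids dividing by zero at their poles.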

For this simplified model, all probability parameters including marginal
probabilities $\pi_k$, initial probabilities $P_{sk}$ and transition
probabilities $P_{kl}$ and $P_{ka}$ have closed-form updates. Hence, to make the
EM+ algorithm more efficient, we partition the parameter set $\Theta$ into two
groups: $\Theta_{\text{fast}} = \{\pi_k, P_{sk}, P_{kl}, P_{ka}\}$, consisting of
all parameters with closed-form updates, and
$\Theta_{\text{slow}} = \{\lambda_k(t)\}$, consisting of all parameters that we
must update with quasi-Newton. Instead of updating all of $\Theta$ only in the
M-step of the EM algorithm and Step (2c) of the Plus algorithm, parameters
belonging to $\Theta_{\text{fast}}$ are always updated instantaneously “on the
fly”, meaning that they are updated whenever there is a change in $Z$ or the
cluster labels $e$. More specifically, $\Theta_{\text{fast}}$ are updated when
calculating each likelihood function in the Gibbs sampler (24) of the EM
algorithm and Step (2a) of the Plus algorithm.

5.2. Miami Heat versus Boston Celtics in 2012. In the 2012 NBA Eastern
Conference finals, eleven players from the Heat and ten players from the
Celtics played in the first three quarters of their 1st and 5th games. We
omit two Celtics players, Ryan Hollins and Marquis Daniels, because they
each touched the ball only once in those quarters. The data, which have been
illustrated in Table 2, consist of 283 plays (142 for the Heat and 141 for the
Celtics) and 1205 transactions (657 for the Heat and 548 for the Celtics). We
fit three different CSBMs: one to the Heat’s transactions alone, one to the
Celtics’ transactions alone, and another one to transactions from both teams
pooled together. In what follows, we discuss in detail our clustering results,
initial probability estimates, fitted rate functions, and transition probability
estimates. Given our data size (11 Heat players and 8 Celtics players), we
picked a moderate number of clusters ($K = 3$). In practice, since the main
purpose of our model is to cluster players and narrow down the search space
for basketball scouts, the choice of $K$ will mostly depend on the size of the
basketball network and how elaborate one wants the clustering results to
be.

Clustering results. The cluster labels for the players are reported in Table
3. Recall that basketball players play in five different positions: point guard
(PG), shooting guard (SG), small forward (SF), power forward (PF) and center
(C). Generally speaking, the heights of the players are PG<SG<SF<PF<C.

Considered separately, players in the two teams are clustered in similar
manners. Point guards are in cluster 1; two perimeter players ({Wade,
James} from the Heat and {Allen, Pierce} from the Celtics) are in cluster
2; and the other players are in cluster 3. Roughly speaking, players with
similar heights and close positions are clustered into the same group. Point
guards certainly play in a different style than power forwards and
centers. Shooting guards and small forwards are both perimeter players and
often play in similar styles. In our case, Wade, James, Allen and Pierce are
different than the other perimeter players, because they are stars. They have
extraordinary offensive skills, so they can carry the ball longer and shoot
more often. By contrast, the shooting guards and small forwards in cluster
3 play without the ball for most of the time.

Table 3

Clustering results for the 2011-2012 Miami Heat and Boston Celtics (K = 3). Cluster
labels are C1, C2, C3. Three different clustering results are presented (two under
“Alone” and one under “Together”). Player positions are included for reference only;
they are not used by the clustering algorithm.

                                    Alone             Together
Team     Player           Position  C1   C2   C3      C1   C2   C3
Heat     Mario Chalmers   PG        X                 X
         Norris Cole      PG        X                 X
         Dwyane Wade      SG             X                 X
         LeBron James     SF             X                 X
         James Jones      SG                  X                 X
         Shane Battier    SF                  X                 X
         Mike Miller      SF                  X                 X
         Chris Bosh       PF                  X                 X
         Udonis Haslem    PF                  X                 X
         Ronny Turiaf     C                   X                 X
         Joel Anthony     C                   X                 X
Celtics  Rajon Rondo      PG        X                 X
         Keyon Dooling    PG        X                 X
         Ray Allen        SG             X                 X
         Paul Pierce      SF             X                 X
         Mickael Pietrus  SF                  X                 X
         Brandon Bass     PF                  X            X
         Kevin Garnett    C                   X                 X
         Greg Stiemsma    C                   X                 X


When the two teams are pooled together, only one player (Brandon Bass 
from the Celtics) switches from cluster 3 to cluster 2. Actually, he is a “mini” 
PF, who has a typical PF’s weight and strength but the height of an SF, 
so his playing style is in between those of a typical SF and a typical PF. 
When compared only with other Celtics players, he is more similar to those 
in cluster 3. However, when players from the Heat also are included in the 
comparison, he starts to look more similar to LeBron James (a strong SF) 
and very different than those in cluster 3 who are on the Heat, e.g., in terms 
of rebounding, cutting, post playing, so he is re-clustered into cluster 2. 

In our subjective assessment, players in cluster 1 tend to dribble the ball 
a lot but do not shoot very often, those in cluster 2 both carry and shoot the 
ball, whereas those in cluster 3 are mostly responsible for catching rebounds 
and shooting, but not so much for carrying the ball. In what follows, we will
see these differences of the three clusters reflected in the different parameters
of the CSBM.

Initial probabilities. Table 4 displays the estimated transition probabilities
from each initial action to the three clusters. Most inbounds go to point
guards, because they usually are the ones to carry the ball from the back
court to the front court. The Heat inbound more often to cluster 2 than the
Celtics do, because LeBron James (in cluster 2) sometimes plays like a point
guard. More than half of the rebounds are caught by cluster 3, the tall
players. For the Celtics, their cluster 1 players catch almost as many rebounds
as those in their cluster 2, because the starting point guard, Rajon Rondo
(in cluster 1), is an excellent rebounder. Regarding steals (a relatively rare
event), the three clusters contribute equally within the Heat but somewhat
differently within the Celtics.


Table 4

Estimated transition probabilities ($P_{sk}$) from each initial action to clusters C1, C2, C3,
for three different clustering models of the 2011-2012 Miami Heat and Boston Celtics.

Model     Initial action  C1      C2      C3
Heat      Inbound         0.716   0.194   0.090
          Rebound         0.109   0.375   0.516
          Steal           0.333   0.333   0.333
Celtics   Inbound         0.868   0.059   0.073
          Rebound         0.188   0.208   0.604
          Steal           0.375   0.500   0.125
Together  Inbound         0.793   0.133   0.074
          Rebound         0.143   0.357   0.500
          Steal           0.364   0.454   0.182



Rate functions. Figure 6 contains the fitted rate functions
$\{\lambda_k(t) : k = 1, 2, 3\}$ for the ball leaving a player in group $k$.
Overall, these functions are quite different for the three clusters. For the
same cluster, the rate functions from different teams are similar in general,
but have considerable differences at certain time points. Below, we compare the
patterns of the rate functions over four distinct time periods:
$t \in (0, 5)$, $t \in (5, 10)$, $t \in (10, 15)$, and $t > 15$.

At the beginning of a play, it usually takes about five seconds for a point
guard to dribble the ball from the back court to the front court. Players
in cluster 2 sometimes do that instead of point guards. Therefore,
$\lambda_1(t)$ and $\lambda_2(t)$ are low for $t \in (0, 5)$. However, for both
teams their $\lambda_3(t)$ has a high and sharp peak around $t \approx 2$,
because players in cluster 3 often catch rebounds and start new plays by
quickly passing the ball to those in the other two clusters.

Fig 6. Fitted rate functions for the 2011-2012 Miami Heat and Boston Celtics,
$\lambda_1(t)$, $\lambda_2(t)$ and $\lambda_3(t)$, each describing the rate with
which the ball leaves a player in cluster 1, cluster 2 and cluster 3,
respectively.

After the ball arrives at the front court, the players spend about 5 seconds
to settle down to their offensive layout. During this time period, i.e.,
$t \in (5, 10)$, the two teams have different strategies. For the Heat, the
point guards usually pass the ball to either James or Wade and let them handle
the ball, so we can see a small peak in the Heat’s $\lambda_1(t)$ function. For
the Celtics, their point guards, especially Rondo, usually continue to hold
the ball and organize the offense, so the Celtics’ $\lambda_1(t)$ function even
declines a little right after $t > 5$. The two teams’ $\lambda_3(t)$ functions
exhibit significant differences over this time period. For the Heat, their
players in cluster 3 mostly play as transit ports, i.e., they get the ball and
pass it out soon. For the Celtics, their players in cluster 3, especially Kevin
Garnett, have more opportunities to handle the ball. That is why in the right
panel, the Heat’s $\lambda_3(t)$ function has a peak around $t \approx 7$, while
the Celtics’ $\lambda_3(t)$ has a local minimum between $5 < t < 6$.

For $t \in (10, 15)$, if the play still keeps going, players start to pass the
ball more frequently and seek scoring opportunities. This is indicated by higher
values in $\lambda_1(t)$ and $\lambda_2(t)$ as well as a local peak in
$\lambda_3(t)$ on $t \in (10, 15)$. During this time period, both teams play in
a similar style, and their rate functions almost overlap.

Due to the 24-second time limit for each play, both teams increase their
offensive pace after $t > 15$. However, when time reaches about
$t \approx 20$, the two teams start to show highly distinctive playing patterns.
For the Heat, all three of their rate functions rise rapidly, which means that
all of their players tend to release the ball quickly, either passing it on to
others or shooting. For the Celtics, their $\lambda_2(t)$ and $\lambda_3(t)$
also rise, but not as much as those of the Heat. The Celtics appear to play
with more patience. An unusual phenomenon is that, for the Celtics, their rate
function $\lambda_1(t)$ actually decreases after $t > 17$. This is because the
starting point guard, Rajon Rondo (in cluster 1), is not the best jump shooter.
Close to the end of the time limit and against the tough defense from the Heat,
he typically struggles a bit trying to pass or shoot, so the ball stays in his
hands for a little longer.

In Appendix C, we provide an expanded version of Figure 6 which includes
95% confidence bands for these rate functions.

Transition probabilities. The estimated transition probabilities for events 
originating from the three different clusters are presented in Table 5. We will 
focus on the transition probabilities of each team alone. When the two teams 
are pooled together, the estimated transition probabilities simply appear to 
be averages of the individual team results. 

Table 5

Estimated transition probabilities ($P_{kl}$ and $P_{ka}$) for the 2011-2012 Miami Heat and
Boston Celtics (K = 3). Rows are originating clusters and columns are receiving clusters
and play outcomes.

Model     From  C1      C2      C3      Make 2  Miss 2  Make 3  Miss 3  Fouled  TO
Heat      C1    0       0.564   0.296   0.035   0.014   0       0.042   0.021   0.028
          C2    0.188   0.262   0.225   0.103   0.087   0.008   0.032   0.063   0.032
          C3    0.226   0.426   0.090   0.052   0.064   0.039   0.064   0.013   0.026
Celtics   C1    0.175   0.332   0.327   0.031   0.083   0.010   0.016   0.005   0.021
          C2    0.270   0.066   0.262   0.065   0.172   0.033   0.066   0.041   0.025
          C3    0.304   0.177   0.094   0.191   0.149   0       0.014   0.057   0.014
Together  C1    0.119   0.479   0.250   0.032   0.053   0.006   0.026   0.012   0.023
          C2    0.220   0.210   0.211   0.097   0.124   0.014   0.039   0.056   0.029
          C3    0.262   0.341   0.083   0.110   0.087   0.023   0.045   0.030   0.019


First, we look at passes among clusters. For the Heat, James and Wade 
(both in cluster 2) are the absolute key players for the team, so players from 
both cluster 1 and cluster 3 tend to pass the ball to them (cluster 2) with 
very high probabilities (56.4% and 42.6%, respectively). James and Wade 
also pass the ball more often to each other than to the other clusters (26.2% 
vs. 18.8% and 22.5%, respectively). The two players in cluster 1, Chalmers 
and Cole, do not pass to each other in our data because they are never on the 
court at the same time during those games. The Celtics, on the other hand,
tend to move the ball more evenly among the three clusters. Their clusters
1 and 2 each have almost equal probabilities of passing the ball to the other
two clusters. Their transition probabilities are lower within each cluster than
between different clusters.

Next, we discuss shooting choices. For the Heat, the overall probabilities
of shooting the ball (sum of Make 2, Miss 2, Make 3, and Miss 3) are 9.1%
for cluster 1, 23.0% for cluster 2, and 21.9% for cluster 3. Meanwhile, the
corresponding numbers for the Celtics are 14.0% for cluster 1, 33.6% for cluster
2, and 35.4% for cluster 3. Relatively speaking, when releasing the ball,
the Heat players have lower chances to take a shot than the Celtics players
do, but higher chances to pass the ball to their teammates. This shows that
the offense of the Heat involves more interactions among players. For both
teams, the respective shooting probabilities for clusters 2 and 3 are more
than twice as high as those for cluster 1. Let us look into these probabilities
in even greater detail. James and Wade (cluster 2, Heat) shoot many more
2-pointers than 3-pointers, and incredibly, they score more than half of their
2-pointer shots. Indeed, James and Wade are outstanding at penetration, but
not great 3-point shooters. By contrast, Pierce and Allen (cluster 2, Celtics)
are better balanced. They shoot and make more 3-pointers than James and
Wade do. In the offensive end, Pierce has been regarded as one of the most
well-rounded players (as of 2012), because of his ability to score from almost
any location. Allen is an extraordinary 3-point shooter, actually one of the
best in the entire NBA history. Unfortunately, Pierce and Allen miss many
2-pointers in these two games. For the Heat, both their cluster 1 and cluster
3 shoot many 3-pointers (almost as many as 2-pointers), since one of their
main strategies is for James and Wade to attract the defense from their
opponents while their other players seek open-shot opportunities (mostly
3-pointers). For the Celtics, their clusters 1 and 3 mostly shoot 2-pointers,
and their main attacking areas are close to the hoop.
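The shooting percentages quoted above are row sums over the four shot columns of Table 5, and the "more than half" remark compares Make 2 with Miss 2. As a short check, the Python sketch below redoes this arithmetic from the table entries. The function names are ours, and the ratio it reports is only the share of made shots among 2-point attempts that end a possession as Make 2 or Miss 2 (attempts that draw a foul are counted under Fouled in the table).

```python
# (Make 2, Miss 2, Make 3, Miss 3) entries copied from Table 5.
TABLE5_SHOTS = {
    ("Heat", "C1"): (0.035, 0.014, 0.000, 0.042),
    ("Heat", "C2"): (0.103, 0.087, 0.008, 0.032),
    ("Heat", "C3"): (0.052, 0.064, 0.039, 0.064),
    ("Celtics", "C1"): (0.031, 0.083, 0.010, 0.016),
    ("Celtics", "C2"): (0.065, 0.172, 0.033, 0.066),
    ("Celtics", "C3"): (0.191, 0.149, 0.000, 0.014),
}

def shooting_probability(row):
    """Probability that a release of the ball is a shot of any kind."""
    return sum(row)

def two_point_share_made(row):
    """Make 2 / (Make 2 + Miss 2): share of 2-point attempts that go in."""
    make2, miss2 = row[0], row[1]
    return make2 / (make2 + miss2)

for key, row in TABLE5_SHOTS.items():
    print(key, round(shooting_probability(row), 3),
          round(two_point_share_made(row), 3))
```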

Finally, we examine the probabilities of drawing a foul and committing a 
turnover. Note that “drawing a foul” means being fouled by the opposing 
team, often after fooling them with fake moves. For the Heat, James and 
Wade draw fouls with much higher probability than do their teammates in 
cluster 1 and cluster 3 (6.3% vs. 2.1% and 1.3%). The reason is that James 
and Wade are often the ones to penetrate, while their teammates usually 
play “catch and shoot”. For the Celtics, players in their cluster 3 have the 
highest probability of drawing fouls, because those players — for example, 
Kevin Garnett — are very aggressive when playing close to the hoop; players 
in their cluster 2 are also good at drawing fouls, as Pierce is a master at 
doing so. Overall, the Celtics are more capable of drawing fouls, but they
make fewer turnovers than the Heat, because they play at a slower pace and
make fewer passes.

5.3. Cleveland Cavaliers versus Golden State Warriors in 2015. We now
analyze two games in the 2015 NBA finals between the Cleveland Cavaliers
and the Golden State Warriors, namely games 2 and 5. Again,
we consider only the first three quarters. These two games are particularly
interesting case-study materials for us because there was a fascinating change
in the Warriors’ lineup in between. After losing both games 2 and 3 of the
series, Steve Kerr, the head coach of the Warriors, decided to change their
regular lineup to a small lineup, which meant that they stopped playing
centers. This was an unconventional strategy but it successfully turned the
series around, and the Warriors went on to win the championship that year
by winning three consecutive games!

These two teams have very different styles of play to start with. The 
aforementioned change in the Warriors’ lineup meant there was a big change 
in how the two teams played these two particular games as well. Thus, 
unlike in the previous section, in this section we simply fit four CSBMs 
separately for each team and each game, and no longer fit a pooled model 
combining the two teams and the two games together. Overall, there are 
four data sets. For game 2, the Cavaliers have eight players, 84 plays and 
290 transactions, while the Warriors have ten players, 75 plays and 307 
transactions. For game 5, the Warriors have ten players, 79 plays and 296 
transactions, whereas the Cavaliers have eight players, 81 plays and 291 
transactions. As in the previous section, in what follows we give detailed 
discussions about the clustering results, initial probability estimates, fitted 
rate functions, and transition probability estimates, in that order. 

Clustering results. The cluster labels of the players for the two games are 
reported in Table 6. As in the previous section, we set K = 3 here as well. 

For the Cavaliers, the results from the two games are similar, except their 
two shooting guards — Iman Shumpert and J.R. Smith — switch clusters. 
It is not surprising that LeBron James is in a cluster by himself. In these 
two games, he is the only core player of the Cavaliers since their other two 
superstars, Kyrie Irving and Kevin Love, are both absent due to injuries. 
Without support from other superstar teammates, James has to take charge 
of a large amount of ball handling, passing and scoring; he simply does it 
all. Indeed, James is one of the most versatile players in the history of the 
NBA. With James being the only primary ball handler of the Cavaliers, 
their cluster 2 consists of secondary ball handlers: the point guard, Matthew 
Dellavedova, for both games; and a shooting guard — Shumpert for game 
2 and Smith for game 5. In general, both Shumpert and Smith can dribble
and shoot. Shumpert handles the ball more often than does Smith in game
2, but their roles are reversed in game 5. Other than Smith (in game 2)
and Shumpert (in game 5), their cluster 3 consists of {James Jones, Mike
Miller}, both catch-and-shoot players, and {Tristan Thompson, Timofey
Mozgov}, both inside (the paint) players. Overall, the Cavaliers are a team
built around a single key player, LeBron James.

Table 6

Clustering results for the 2014-2015 Cleveland Cavaliers and Golden State Warriors
(K = 3). Cluster labels are C1, C2, C3. Four different clustering results are presented
(two teams x two games). Player positions are included for reference only; they are not
used by the clustering algorithm.

                                          Game 2           Game 5
Team       Player               Position  C1   C2   C3     C1   C2   C3
Cavaliers  Matthew Dellavedova  PG             X                X
           Iman Shumpert        SG             X                     X
           J.R. Smith           SG                  X           X
           LeBron James         SF        X                X
           James Jones          SF                  X                X
           Mike Miller          SF                  X                X
           Tristan Thompson     PF                  X                X
           Timofey Mozgov       C                   X                X
Warriors   Stephen Curry        PG        X                X
           Shaun Livingston     PG        X                X
           Klay Thompson        SG             X                X
           Leandro Barbosa      SG             X                X
           Harrison Barnes      SF                  X           X
           Andre Iguodala       SF        X                          X
           Draymond Green       PF        X                          X
           David Lee            PF        Did Not Play          X
           Andrew Bogut         C                   X      Did Not Play
           Festus Ezeli         C                   X      Did Not Play
           Marreese Speights    C                   X      Did Not Play

The Warriors, on the other hand, play the two games in fairly different 
styles. First of all, the active rosters are different: all three centers
(Andrew Bogut, Festus Ezeli and Marreese Speights) play in game 2 but
not in game 5; meanwhile, David Lee does not play in game 2, but does
play in game 5. We already explained the reason behind these changes in
their lineup at the beginning of this section (Section 5.3). Beyond the clear
change of rosters, our CSBM reveals more insight into the different playing
styles of the Warriors in these two games. Unlike the Cavaliers, the Warriors
have 4 primary ball handlers and distributors: Stephen Curry (PG),
Shaun Livingston (PG), Andre Iguodala (SF) and Draymond Green (PF).
In game 2 under their regular lineup, our model clusters these four players 
together. The two shooting guards, Klay Thompson and Leandro Barbosa, 
are clustered in one cluster. The three centers together with a small forward, 
Harrison Barnes, form the last cluster. In game 5 under their small lineup, 
our model divides their 4 primary ball handlers into two clusters — the two 
point guards, Curry and Livingston, are in one cluster; the two forwards, 
Iguodala and Green, are in another. All remaining players are in a separate 
cluster. Note that, although both Barnes and Lee are forwards, their roles in 
the team are considerably less important than those of Iguodala and Green. 


Table 7

Estimated transition probabilities ($P_{sk}$) from each initial action to clusters C1, C2 and
C3, for four different clustering models of the 2014-2015 Cleveland Cavaliers and Golden
State Warriors.

Team and game      Initial action  C1      C2      C3
Cavaliers, Game 2  Inbound         0.489   0.422   0.089
                   Rebound         0.265   0.088   0.647
                   Steal           0.200   0.400   0.400
Cavaliers, Game 5  Inbound         0.500   0.409   0.091
                   Rebound         0.429   0.107   0.464
                   Steal           0.286   0.428   0.286
Warriors, Game 2   Inbound         0.767   0.093   0.140
                   Rebound         0.400   0.160   0.440
                   Steal           0.714   0.143   0.143
Warriors, Game 5   Inbound         0.660   0.140   0.200
                   Rebound         0.185   0.296   0.519
                   Steal           0.250   0       0.750


Initial probabilities. The estimated transition probabilities from each initial 
action to the three clusters are shown in Table 7. 

For the Cavaliers, the probabilities of the two games are similar, except 
the rebounds of LeBron James (the only player in cluster 1). James catches 
many more rebounds in game 5 than he does in game 2 (42.9% vs. 26.5%). 
The reason here is that, with the Warriors playing the small lineup, James 
becomes one of the tallest and biggest men on the court, playing closer to 
the rim and catching more rebounds. For both games, more than 90% of the 
inbounds go to cluster 1 and cluster 2, with cluster 1 receiving slightly more
than cluster 2. Players in cluster 2 contribute more than 40% of the steals
in the two games, while the other two clusters split the remainder. 

For the Warriors, recall that their three centers, belonging to cluster 3 in 
game 2, do not play in game 5, and their two forwards, Iguodala and Green, 
belonging to cluster 1 in game 2, become the new cluster 3 in game 5. As 
a result, their inbound probabilities change slightly, but their rebound and
steal probabilities change dramatically. In more detail, their players
in cluster 1 have a much higher probability of receiving an inbound than
those in the other two clusters combined, because their cluster 1 contains two
point guards, Curry and Livingston. However, this probability goes down by
about 10 percentage points from game 2 (76.7%) to game 5 (66%), whereas those
of cluster 2 and cluster 3 each increase by about 5 percentage points. These
results imply that, when the
Warriors switch to their small lineup in game 5, players other than those 
in cluster 1 also get more opportunities to receive inbounds and initiate 
plays. In game 5, due to the absence of the centers, who make up cluster 3 and
contribute 44% of the rebounds in game 2, the rebounding load is shared
among all players. In particular, Green and Iguodala
(in cluster 3 for game 5) now catch 51.9% of the rebounds, in contrast to
less than 40% when they are in cluster 1 for game 2; the contribution of cluster 2
to rebounds increases from 16% in game 2 to 29.6% in game 5; and finally, 
without Green and Iguodala (now in cluster 3), the two point guards that 
remain in cluster 1 (i.e., Curry and Livingston) also manage to catch 18.5% 
of the rebounds. Regarding steals, the most significant changes are a huge 
decrease for cluster 1 (71.4% to 25%) and a huge boost for cluster 3 (14.3% 
to 75%). Once more, this is because Green and Iguodala have “moved” from 
cluster 1 to cluster 3; they both are top defenders who contribute to many 
steals. 
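
The shifts quoted above can be checked directly against Table 7. The following is a minimal sketch, not part of the fitting procedure: the dictionary simply copies the Warriors rows of Table 7, and the helper name `change_in_points` is a hypothetical choice made here for illustration. Because cluster labels are re-estimated for each game, the same label does not denote the same players in both games (Green and Iguodala move from C1 to C3), so the differences describe labels, not fixed groups of players.

```python
# Estimated initial probabilities P_sk copied from Table 7 (Warriors only).
# Each list holds the probabilities for clusters C1, C2, C3, in that order.
warriors = {
    "game2": {"Inbound": [0.767, 0.093, 0.140],
              "Rebound": [0.400, 0.160, 0.440],
              "Steal":   [0.714, 0.143, 0.143]},
    "game5": {"Inbound": [0.660, 0.140, 0.200],
              "Rebound": [0.185, 0.296, 0.519],
              "Steal":   [0.250, 0.000, 0.750]},
}


def change_in_points(table, action):
    """Game 5 minus game 2, in percentage points, for labels C1, C2, C3."""
    game2, game5 = table["game2"][action], table["game5"][action]
    return [round(100 * (after - before), 1)
            for before, after in zip(game2, game5)]


for action in ("Inbound", "Rebound", "Steal"):
    print(action, change_in_points(warriors, action))
```

For inbounds, the C1 probability drops by roughly ten points while C2 and C3 each gain about five, which is the pattern described in the text.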

Rate functions. The fitted rate functions of the Cavaliers and the Warriors 
are displayed in Figure 7 and Figure 8, respectively. 

For the Cavaliers, the rate functions from the two games appear to be
generally similar for each respective cluster, with some small differences.
For cluster 1 (James), its rate function $\lambda_1(t)$ is almost the same in the two
games for t < 17: fairly flat and low. This means that James plays with
almost the same style at the beginning of a play in both games, keeping
the ball in his hands and organizing the offense. Toward the end of a play,
James starts to “heat up” at around t ≈ 17 in game 2, whereas he does so
slightly later in game 5, at about t ≈ 19. This is because the small lineup of
the Warriors in game 5 moves much more quickly, so they can defend James
more effectively in the last few seconds and delay his offense. For cluster 2,
the first big difference appears after t > 7. In game 2, $\lambda_2(t)$ grows slowly to
reach a peak at t ≈ 14; however, in game 5, the same function $\lambda_2(t)$ grows
rapidly after t > 7 and maintains a high level until t ≈ 14. Clearly, players
in cluster 2 have increased their offensive pace in game 5. On the one hand,
Smith (cluster 2 SG in game 5) does more quick-release shooting than does
Shumpert (cluster 2 SG in game 2). On the other hand, the higher defensive
pressure created by the Warriors’ small lineup has forced the Cavaliers to
move the ball more quickly. For the same reasons, toward the end of a play,
players in cluster 2 also tend to attack the rim or pass the ball slightly
earlier in game 5 (at t ≈ 16) than they do in game 2 (at t ≈ 18). For cluster
3, their rate function $\lambda_3(t)$ displays a similar pattern in the two games,
but the one in game 5 lies almost entirely below the one in game 2.
Players in this cluster are big men and typically catch-and-shoot players;
they are usually not responsible for handling the ball. They catch rebounds
and start a play by passing the ball to their teammates in the other two
clusters. At around t ≈ 12, they get their first chance to touch the ball,
when they either shoot or pass it back to the ball handlers. Their second
chance to touch the ball happens near the end of a play, when they have
to shoot rapidly. In game 5, the small lineup of the Warriors can quickly
cover the open shots and “double up” to defend a big man in the paint, and
that forces players in the Cavaliers’ cluster 3 to keep the ball in their hands for
a slightly longer period. This is why their $\lambda_3(t)$ is lower in game 5 than in
game 2. Overall, the patterns displayed in the Cavaliers’ three rate functions
are quite similar in the two games. The changes mostly can be attributed
to the different defensive strategies used by their opponent.

Fig 7. Fitted rate functions for the 2014-2015 Cavaliers, $\lambda_1(t)$, $\lambda_2(t)$ and $\lambda_3(t)$, each
describing the rates with which the ball leaves a player in cluster 1, cluster 2 and cluster
3, respectively.

For the Warriors, though, due to the change in their lineup, their rate
functions from the two games are noticeably different. In game 2 with their
regular lineup, their rate functions (blue solid lines in Figure 8) show regular
patterns: at the start of a play, $\lambda_1(t)$ and $\lambda_2(t)$ are relatively low, while
$\lambda_3(t)$ has high peaks. This means that, at the start of a play, players in
cluster 1 and cluster 2 tend to handle the ball, whereas those in cluster 3
catch rebounds and pass the ball out more or less immediately. This is the
same as the playing style of the Cavaliers. However, their rate functions
have more peaks than those of the Cavaliers. Moreover, their $\lambda_1(t)$ and
$\lambda_2(t)$ in game 2 are, in general, higher than those of the Cavaliers at the
start of a play. These show that the Warriors’ offense is more flexible:
the ball is passed more frequently, so everybody gets chances to touch it,
and no one holds the ball for a very long time. In fact, this has become
the Warriors’ signature team-playing style. However, in game 5, all three
of their rate functions show significant differences. First, the two peaks of
$\lambda_1(t)$ occur earlier in game 5 than in game 2. Second, the rapid growth
of $\lambda_2(t)$ also appears earlier in game 5 (at t ≈ 17) than in game 2 (at
t ≈ 22). Both differences indicate that, with a small lineup, the Warriors
have increased their offensive pace in game 5. Finally, their $\lambda_3(t)$ changes
dramatically between the two games; in game 5, it is much flatter at the
beginning and has a much higher peak at t ≈ 17. This is certainly because,
in game 5, the players making up cluster 3 are entirely different from the
ones in game 2. From Table 7, we know that their cluster 3 in game 5 (Green
and Iguodala) catch a larger proportion of rebounds than do their cluster
3 in game 2 (three centers), but instead of immediately passing the ball
out, Green and Iguodala both often dribble and run the play. The peak
of $\lambda_3(t)$ at t ≈ 17 in game 5 is particularly significant, revealing one key
offensive strategy of the Warriors’ small lineup, the so-called “high
pick-and-roll”. A typical sequence of this strategy is as follows: Curry dribbles
the ball outside the three-point line, and Green (or Iguodala) comes to set
up a screen (a “human body wall”). Thanks to Curry’s incredible
three-point shooting skills, after he dribbles around the screen both defenders of
Curry and of Green (or Iguodala) usually have to focus on covering Curry
together, leaving Green (or Iguodala) wide open, so Curry can now pass the
ball to him. Green (or Iguodala) can then shoot the ball; drive to the basket
directly; or take one or two dribbles, draw another defender, and then pass
the ball to another wide-open teammate, who is usually waiting at the
three-point line on the weakly-defended side. This entire sequence often happens
very quickly, within three seconds.

Fig 8. Fitted rate functions for the 2014-2015 Golden State Warriors, $\lambda_1(t)$, $\lambda_2(t)$ and
$\lambda_3(t)$, each describing the rates with which the ball leaves a player in cluster 1, cluster 2
and cluster 3, respectively.

Overall, the estimated rate functions reveal many intricate details of a 
team’s playing style. The Cavaliers play around their key superstar, LeBron 
James, whereas the Warriors share the ball more evenly. It also can be easily 
seen that the Warriors have played these two games quite differently and the 
Cavaliers have responded with small but clear adjustments in their playing 
style as well. 


Table 8

Estimated transition probabilities ($P_{kl}$ and $P_{ka}$) for the 2014-2015 Cleveland Cavaliers
and Golden State Warriors (K = 3). Rows are originating clusters and columns are
receiving clusters and play outcomes.

Team and game      From  C1     C2     C3     Make2  Miss2  Make3  Miss3  Fouled  TO
Cavaliers, Game 2  C1    0      0.167  0.430  0.111  0.153  0.014  0.014  0.069   0.042
                   C2    0.292  0.141  0.259  0.016  0.081  0      0.114  0.032   0.065
                   C3    0.335  0.160  0.111  0.098  0.123  0.037  0.037  0.062   0.037
Cavaliers, Game 5  C1    0      0.384  0.274  0.123  0.110  0      0.027  0.027   0.055
                   C2    0.296  0.181  0.261  0.024  0.036  0.059  0.107  0       0.036
                   C3    0.346  0.198  0.076  0.061  0.122  0.045  0.061  0.061   0.030
Warriors, Game 2   C1    0.357  0.233  0.194  0.037  0.037  0.007  0.060  0.030   0.045
                   C2    0.380  0      0.120  0.140  0.080  0.100  0.120  0.060   0
                   C3    0.469  0.226  0      0.061  0.081  0      0.061  0.061   0.041
Warriors, Game 5   C1    0.159  0.276  0.323  0.058  0.058  0.046  0.034  0       0.046
                   C2    0.151  0.097  0.285  0.151  0.166  0.015  0.015  0.060   0.060
                   C3    0.330  0.267  0.137  0.064  0.038  0.025  0.038  0.076   0.025


Transition probabilities. The estimated transition probabilities of events 
originating from the three different clusters are displayed in Table 8, for 
both the Cavaliers and the Warriors. 

For the Cavaliers, the overall probabilities to pass the ball (sum of the first
three columns) for the three respective clusters are {59.7%, 69.2%, 60.6%} 
in game 2, and {65.8%, 73.8%, 62%} in game 5. Clearly, the Cavaliers make 
more passes in game 5 than in game 2, which is due to the stronger defense 
by the Warriors’ small lineup. For the same reason, in game 2 James (the 
only player in cluster 1) passes more to cluster 3 (shooters and big men), 
whereas in game 5 he passes more to cluster 2 (ball handlers). The respective 
roles of their cluster 2 and cluster 3 do not change much in the two games 
— cluster 2 is the bridge between cluster 1 and cluster 3, making almost 
an equal proportion of passes to each of the other two clusters; cluster 3, 
however, more often passes the ball back to James (cluster 1). The overall 
probabilities to shoot the ball (sum of columns 4-7) for the three respective 
clusters are {29.2%, 21.1%, 29.5%} in game 2, and {26%, 22.6%, 28.9%} in 
game 5, which do not change much. When facing the quick defense of the
Warriors in game 5, the Cavaliers have successfully created an almost equal
percentage of shots by making more passes. Regarding the probabilities of
being fouled and making turnovers, James (cluster 1) fails to draw as many 
fouls in game 5 as he does in game 2 (2.7% vs. 6.9%), but he makes more 
turnovers (5.5% vs. 4.2%). These can be partly attributed, again, to the 
stronger defense by the Warriors’ small lineup, especially the one-on-one 
defense on James by Iguodala. Players in cluster 2 are not as aggressive in 
game 5 as they are in game 2 — although they make fewer turnovers (3.6% 
vs. 6.5%), they do not draw any fouls at all (0% vs. 3.2%). The performance 
of cluster 3 is fairly stable in the two games in terms of drawing fouls and 
making turnovers. 
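
The aggregate pass and shoot probabilities quoted above are sums over columns of Table 8. As a small illustrative sketch (the helper name `summarize` and the data layout are choices made here, not part of the CSBM itself), the Cavaliers rows can be collapsed as follows:

```python
# Cavaliers rows of Table 8. Column order:
# C1, C2, C3, Make2, Miss2, Make3, Miss3, Fouled, TO.
cavaliers = {
    ("game2", "C1"): [0.000, 0.167, 0.430, 0.111, 0.153, 0.014, 0.014, 0.069, 0.042],
    ("game2", "C2"): [0.292, 0.141, 0.259, 0.016, 0.081, 0.000, 0.114, 0.032, 0.065],
    ("game2", "C3"): [0.335, 0.160, 0.111, 0.098, 0.123, 0.037, 0.037, 0.062, 0.037],
    ("game5", "C1"): [0.000, 0.384, 0.274, 0.123, 0.110, 0.000, 0.027, 0.027, 0.055],
    ("game5", "C2"): [0.296, 0.181, 0.261, 0.024, 0.036, 0.059, 0.107, 0.000, 0.036],
    ("game5", "C3"): [0.346, 0.198, 0.076, 0.061, 0.122, 0.045, 0.061, 0.061, 0.030],
}


def summarize(row):
    """Collapse one row into pass, shoot, fouled and turnover probabilities."""
    return {
        "pass": sum(row[0:3]),    # ball goes to a teammate in C1, C2 or C3
        "shoot": sum(row[3:7]),   # made or missed two- and three-pointers
        "fouled": row[7],
        "turnover": row[8],
    }


for key, row in cavaliers.items():
    summary = summarize(row)
    print(key, {name: round(100 * value, 1) for name, value in summary.items()})
```

Each row of Table 8 sums to one up to rounding, so the four summary entries for a cluster also sum to one.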

For the Warriors, the overall passing probabilities of their three respective 
clusters are {78.4%, 50%, 69.5%} in game 2, and {75.8%, 53.3%, 73.4%} in
game 5. Despite the drastic changes in their lineup, these probabilities do 
not change much. Each of the first three columns in Table 8 contains the 
probabilities that the corresponding cluster is the receiver of the ball passed 
from different clusters. Here, we can easily see that a considerable proportion 
of the passes have shifted from cluster 1 to cluster 3 in game 5. This is 
because Green and Iguodala, two of the four primary ball handlers, are now 
in cluster 3 as opposed to cluster 1, and they receive many passes. The 
overall shooting probabilities (sum of columns 4-7) for their three respective 
clusters are {14.1%, 44%, 20.3%} in game 2, and {19.6%, 34.7%, 16.5%} in
game 5. In both games, players in cluster 2 are more likely to shoot than 
those in the other two clusters. This makes sense because cluster 2 contains 
two shooting guards, Klay Thompson and Leandro Barbosa, who both are 
excellent scorers and often take on a huge responsibility in shooting the ball. 
It can also be seen that, in game 5, the probability to shoot has increased
for cluster 1 but decreased for cluster 2. This is because the small lineup 
gives players in cluster 1 — especially Curry — more open space and hence 
better shooting opportunities; by contrast, Klay Thompson (cluster 2), who 
is less affected by the change in the lineup, struggles with shooting in game 
5. For cluster 3, we see that Green and Iguodala (cluster 3 in game 5) are 
less likely to shoot than the centers (cluster 3 in game 2). With regard 
to shooting, it is well-known that the Warriors rely on three-pointers as 
one of their most important scoring methods. Curry and Thompson are 
arguably the best three-point shooting back-court duo in the entire history 
of the NBA. From Table 8, we can clearly see that the Warriors attempt 
many more three-pointers than the Cavaliers do and they also succeed more 
often. One surprising observation is that players in their cluster 2 shoot 
considerably fewer three-pointers in game 5 than they do in game 2. Indeed, 
this is another piece of evidence showing the struggle of Klay Thompson in 
game 5. There are two significant differences in terms of drawing fouls and 
making turnovers: cluster 1 fails to draw any fouls in game 5 versus 3% in
game 2; and cluster 2 makes more turnovers in game 5 than in game 2 (6% 
vs. 0%). 


5.4. LeBron James: Miami Heat versus Cleveland Cavaliers. Both the 
2011-12 Miami Heat and the 2014-15 Cleveland Cavaliers had LeBron James 
on their teams and made him the key player. Thus, it is especially interesting 
for us to compare the player structures of these two teams, and to see if there 
is any difference in how James has played the game with these different 
teams. We investigate the first question by pooling the transactions of the 
Heat (in their two 2012 games versus the Celtics) and the transactions of the 
Cavaliers (in their two 2015 games versus the Warriors), and applying the 
CSBM to cluster the players from both teams together. With regard to the 
second question, we simply compare the individual results we have obtained 
earlier for the Heat (Section 5.2) and for the Cavaliers (Section 5.3). 

For the pooled CSBM, we focus primarily on the clustering results in 
this section and forsake any detailed discussions of the rate functions or the 
transition probabilities. Other than LeBron James, Mike Miller and James 
Jones are also on both of these teams. When playing on different teams, 
the same player may play in a different style, depending on his specific role 
for the team. Hence for James (and likewise for Miller and Jones, too), we 
create two separate avatars — one for the games he played on the Heat 
and another for the games he played on the Cavaliers — and treat them as 
two different “players” in the clustering algorithm. We are especially curious 
whether the pooled CSBM will cluster the two avatars of the same player
into the same cluster or different clusters. 
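
The avatar construction amounts to keying each node of the pooled network by the pair (player, team) rather than by the player alone. The following is a minimal sketch of that bookkeeping; the transaction tuples are illustrative placeholders rather than annotated data, and the function names `avatar` and `build_nodes` are hypothetical.

```python
# A few (team, passer, receiver) transactions. These entries are made up
# for illustration and are not taken from the annotated games.
transactions = [
    ("Heat", "LeBron James", "Dwyane Wade"),
    ("Heat", "Mario Chalmers", "LeBron James"),
    ("Cavaliers", "LeBron James", "J.R. Smith"),
    ("Cavaliers", "Matthew Dellavedova", "LeBron James"),
]


def avatar(team, player):
    """Node label that keeps the same player on two teams separate."""
    return f"{player} ({team})"


def build_nodes(transactions):
    """Map each avatar to an integer index in order of first appearance."""
    index = {}
    for team, passer, receiver in transactions:
        for player in (passer, receiver):
            index.setdefault(avatar(team, player), len(index))
    return index


nodes = build_nodes(transactions)
for label, node_id in nodes.items():
    print(node_id, label)
```

With this keying, the clustering algorithm sees "LeBron James (Heat)" and "LeBron James (Cavaliers)" as unrelated nodes, so any agreement between their cluster labels comes from the transaction data alone.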

Table 9 displays the clustering results from the pooled CSBM, fitted to all 
transactions of the Heat and the Cavaliers in the 4 games we have annotated. 
With a total of 19 “players”, we now choose K = 4 instead of K = 3 as we
did in the previous two sections; this allows us to cluster the “players” with 
a slightly finer resolution. 

Our clustering results clearly indicate that the 2011-12 Heat and the 2014- 
15 Cavaliers are built in a very similar way. Cluster 1 consists of point guards; 
cluster 2 consists of superstars — namely, LeBron James (for both teams) 
and Dwyane Wade (for the Heat); cluster 3 consists of the other perimeter 
players — mostly shooters and perimeter defenders; and the last cluster is 
made up of big men — power forwards and centers. It also turns out that the 
two avatars of the same player (whether James, Miller or Jones) are always 
clustered together. Indeed, both teams are built around LeBron James and 
their playing styles are similar, too. James is the primary ball handler and 
distributor for both teams. While playing for the Heat, James has Wade 
as an important helper, but while playing for the Cavaliers, he is the only 
superstar. We can imagine that, if Kyrie Irving, the superstar point guard 
of the Cavaliers, were not injured, he might have joined James and Wade in 
cluster 2. The point guards in these two teams are secondary ball handlers
and serve as bridges between the superstars and the other players. Players in 
cluster 3 are mainly responsible for playing defense and “catch and shoot”. 
The big men in cluster 4 are mostly responsible for catching rebounds and 
scoring under the rim. 

In the rest of this section, we revisit some individual results for the Heat 
(Section 5.2) as well as for the Cavaliers (Section 5.3) in order to compare 
in more detail the performance of LeBron James in those two series. 

First, recall that our cluster labels (e.g., C1, C2, ...) are arbitrary, and
that James has been clustered into C2 with the 2011-12 Heat but into C1
with the 2014-15 Cavaliers. Comparing Figure 6 (middle panel) and Figure
7 (left panel), we find that the Heat’s $\lambda_2(t)$ function has more peaks and
is higher than the Cavaliers’ $\lambda_1(t)$ function overall. This shows that, while
playing for the Heat, James chooses to pass the ball more often at the
beginning of a play. This is mostly because of the presence of Wade, a
superstar teammate, who interacts with James more frequently than the
point guards. Actually, the “two-man fast break” by James and Wade is
one of the Heat’s defining features. Second, comparing Table 4 and Table 7,
we find that, while playing for the Heat, James and Wade together receive
19.4% of the inbounds, whereas, while playing for the Cavaliers, James alone
receives a staggering 50% of the inbounds. The Heat mostly let their point
guards carry the ball past the half court, because they always have one of
them (either Chalmers or Cole) on the court. However, with Irving out with
an injury, the Cavaliers only play one point guard (Dellavedova) in their lineup,
so James has to carry the ball more than usual. Third, while on the Heat,
James and Wade together average a 23% probability to shoot, but while
on the Cavaliers, James alone has an even higher probability to shoot:
29.2% and 26% respectively in the two games against the Warriors. James
is a great scorer as well as offensive organizer. He can freely switch between
these two modes of play depending on the situations in the game. While
playing for the 2011-12 Heat, James has stronger teammates, so he tends to
create more shooting opportunities for others. With the 2014-15 Cavaliers,
however, James must take more shots by himself due to the limited support
from his teammates.

Table 9

Clustering results for the 2011-2012 Miami Heat and the 2014-2015 Cleveland Cavaliers
together (K = 4). Cluster labels are C1, C2, C3, C4. Players appearing with two separate
avatars for the clustering algorithm are marked with an asterisk. Player positions are
included for reference only; they are not used by the clustering algorithm.

Team       Player               Position  C1   C2   C3   C4
Heat       Mario Chalmers       PG        X
           Norris Cole          PG        X
           Dwyane Wade          SG             X
           LeBron James*        SF             X
           James Jones*         SG                  X
           Shane Battier        SF                  X
           Mike Miller*         SF                  X
           Chris Bosh           PF                       X
           Udonis Haslem        PF                       X
           Ronny Turiaf         C                        X
           Joel Anthony         C                        X
Cavaliers  Matthew Dellavedova  PG        X
           Iman Shumpert        SG                  X
           J.R. Smith           SG                  X
           LeBron James*        SF             X
           James Jones*         SF                  X
           Mike Miller*         SF                  X
           Tristan Thompson     PF                       X
           Timofey Mozgov       C                        X




Table 10

Summary of differences between our work and others.

Model                  Network perspective  Time dynamics  Objective
CSBM                   yes                  yes            descriptive at individual level
Fewell et al. (2012)   yes                  no             descriptive at position level
Cervone et al. (2016)  no                   yes            predictive at individual level
                                                           (of final point outcome)


In summary, our analysis using the CSBM shows that the player structure 
of the 2011-12 Heat and that of the 2014-15 Cavaliers are fairly similar. The 
CSBM also reveals many subtle differences in LeBron James’ playing style 
in the two series. 

6. Summary. In this paper, we advocate the concept that basketball 
games can be analyzed as transactional networks. We have proposed a 
Continuous-time Stochastic Block Model to cluster players based on their 
styles of handling the ball. In particular, we model each basketball play 
as an inhomogeneous continuous-time Markov chain, with transition rate 
functions being governed by the players’ cluster membership. We adopt B- 
splines to model the rate functions and an EM+ algorithm to estimate model 
parameters. Applications to a number of NBA games between the 2011-12 
Miami Heat and Boston Celtics and between the 2014-15 Cleveland Cavaliers
and Golden State Warriors have yielded compelling evidence that the
CSBM framework is of great practical value in clustering and evaluating 
basketball players. 
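
To make the phrase "B-splines to model the rate functions" concrete, the sketch below evaluates a rate function as a linear combination of B-spline basis functions using the Cox-de Boor recursion. Everything specific in it is an assumption made for illustration: the cubic degree, the clamped knots on a 24-second clock, the identity link (rate equal to the spline itself) and the coefficient values are not taken from the fitted model.

```python
def bspline_basis(j, degree, knots, t):
    """Cox-de Boor recursion for the j-th B-spline basis function at t."""
    if degree == 0:
        return 1.0 if knots[j] <= t < knots[j + 1] else 0.0
    left = right = 0.0
    span_left = knots[j + degree] - knots[j]
    if span_left > 0:  # skip terms with zero-width support (repeated knots)
        left = ((t - knots[j]) / span_left
                * bspline_basis(j, degree - 1, knots, t))
    span_right = knots[j + degree + 1] - knots[j + 1]
    if span_right > 0:
        right = ((knots[j + degree + 1] - t) / span_right
                 * bspline_basis(j + 1, degree - 1, knots, t))
    return left + right


def rate(t, coefs, knots, degree=3):
    """Rate function written as a combination of B-spline basis functions."""
    return sum(c * bspline_basis(j, degree, knots, t)
               for j, c in enumerate(coefs))


# Clamped cubic knots on [0, 24); the placement is hypothetical.
knots = [0, 0, 0, 0, 6, 12, 18, 24, 24, 24, 24]
# One made-up coefficient per basis function: len(knots) - degree - 1 = 7.
coefs = [0.1, 0.1, 0.2, 0.6, 0.3, 0.8, 1.2]

for t in (0.0, 6.0, 12.0, 17.0, 23.0):
    print(t, round(rate(t, coefs, knots), 3))
```

Because the basis functions are non-negative and sum to one on [0, 24), non-negative coefficients guarantee a non-negative rate, which is one simple way to keep a spline-based rate function valid. The half-open intervals in the degree-0 case mean the sketch returns zero exactly at t = 24.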

As the popularity of basketball analytics appears to be growing in recent 
years, it is perhaps helpful for us to summarize the main differences between 
our work and a few recent works in this area (e.g., Fewell et al., 2012; Cervone
et al., 2016). The key features of our work are: (i) viewing basketball games
from a network perspective, (ii) consideration of time dynamics, and (iii) 
clustering of players at an individual level. In what follows, we discuss how 
our work differs from a few others in terms of these features; a brief summary 
is given in Table 10. 

Fewell et al. (2012) certainly view basketball games from a network
perspective as well, but they do not take time dynamics into account, and their
treatment of players occurs at a position level rather than an individual
level. Specifically, they pre-group players according to their on-court
positions (e.g., point guard, shooting guard, and so on). Whereas our CSBM
describes player differences based on the real-time dynamics of how each
basketball play unfolds, the method developed by Fewell et al. (2012) aims
to describe differences in how each of the five pre-defined positions
communicates with each other, and with various initial and absorbing states, at
an aggregate level, aggregated over both all players holding the same
position and all transactions during a certain time period (e.g., an entire game).
In their work, point guards are always considered together with other point
guards, and any player difference at the individual level is suppressed. While
it is hardly surprising that many players holding the same position often end
up being clustered together by our CSBM, this is certainly not always the
case. For example, our analysis of the two games between the 2014-15
Cleveland Cavaliers and Golden State Warriors (Section 5.3) clearly shows that
the distinctive playing style of LeBron James almost calls for the
definition/creation of a new on-court position, for which some long-time
basketball observers have informally suggested the name of “point forward”. Our
analysis also shows that players like Draymond Green (a power forward)
and Andre Iguodala (a small forward) are certainly playing important roles
in the game beyond the traditional ones defined by their respective on-court
positions.

Cervone et al. (2016), on the other hand, do consider time dynamics, but 
they do not view basketball games from a network perspective. While they 
track the movement of the ball both spatially and over time, they do not 
view players as nodes and passes as edges. Most importantly, their objective 
is fundamentally different from ours. Our goal is to cluster players according 
to their individual playing styles as characterized by the rate functions $\lambda_k(t)$
and the transition probabilities $P_{sk}$, $P_{kl}$ and $P_{ka}$, but theirs is to predict the
final point value of each basketball play/possession as the individual play 
unfolds. One can say that their analysis is driven by outcome but ours is 
driven by style. Although rate functions for ball passing are components of 
both models, their structure and role vary considerably. Our rate functions 
are smooth functions of clock time, and are used to characterize groups 
of players with similar transition rates. Their rate functions are log-linear 
regressions which use predictors derived from motion-capture data, forming 
one component in a hierarchical model whose ultimate objective is to predict 
point value. They are not used to cluster players. 

Early works on the SBM (Snijders and Nowicki, 1997; Holland, Laskey 
and Leinhardt, 1983) are mostly concerned with static networks. Recently, 
Ho, Song and Xing (2011), and Xu and Hero (2014) have used dynamic 
SBMs to study social networks that evolve over time, but their works focus 
on discrete time dynamics and are thus not directly applicable to network 
transactions (such as basketball passes) that happen in continuous time. 

Although we have focused on basketball games in this paper, one certainly
can use our CSBM to analyze any other network where exchanges take place 
between its nodes in continuous time. 

APPENDIX A: SOME DETAILS ABOUT INHOMOGENEOUS 

POISSON PROCESSES 

Consider an inhomogenous Poisson process with rate function p{t). We 
derive the distribution of having m events arriving at time points ti < 
t 2 < ... < tm G [To,r], closely following the presentation by Cook and 
Lawless (2007, p. 30). Let Nt denote the number of events in the time interval 
[t, t + At). By the definition of the Poisson process, for a very small At, 

(32) V{Nt = 0) = 1-p{t)At + o{At), 

(33) V{Nt = 1) = p{t)At + o{At), 

(34) V{Nt > 2) = o(At). 

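Consider a partition of $[T_0, T)$, say $T_0 = u_0 < u_1 < u_2 < \cdots < u_R = T$. By the
“independent increment” property of the Poisson process, we have

$$
\begin{aligned}
\mathbb{P}([T_0, T)) &= \prod_{r=0}^{R-1} \mathbb{P}([u_r, u_{r+1})) \\
&= \Big( \prod_{N_{u_r} = 0} \big[ 1 - \rho(u_r)\Delta u_r + o(\Delta u_r) \big] \Big) \cdot \Big( \prod_{N_{u_r} = 1} \big[ \rho(u_r)\Delta u_r + o(\Delta u_r) \big] \Big) \cdot \Big( \prod_{N_{u_r} \ge 2} \big[ o(\Delta u_r) \big] \Big).
\end{aligned}
\tag{35}
$$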

Notice that $\log[1 - \rho(t)\Delta t] = -\rho(t)\Delta t + o(\Delta t)$, so the logarithm of the first
product in (35), the one over $N_{u_r} = 0$, approaches the Riemann integral,
$-\int_{T_0}^{T} \rho(t)\,dt$, in the limit. Thus, dividing $\Delta u_r$ into each respective term that
corresponds to the interval $[u_r, u_{r+1})$ and taking the limit as $R \to \infty$ and
consequently as $\Delta u_r = u_{r+1} - u_r \to 0$, we obtain that the desired distribution
is

$$
\prod_{i=1}^{m} \rho(t_i) \cdot \exp\Big[ -\int_{T_0}^{T} \rho(u)\,du \Big].
$$
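As an informal numerical check of the limiting argument above, the following Python sketch compares the closed-form log-density with the logarithm of the partition product in (35), after dividing each event cell by its width. The rate function `rho`, the event times and the grid sizes are illustrative assumptions, not quantities taken from our analysis.

```python
import math

def rho(t):
    # Hypothetical smooth rate function on [0, 24]; chosen only for illustration.
    return 0.2 + 0.05 * math.sin(t / 4.0)

def integral(f, a, b, n=20000):
    # Composite midpoint rule for the integral of f over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (r + 0.5) * h) for r in range(n))

def log_density(times, t0, t1):
    # Closed form: sum_i log rho(t_i) - integral of rho over [t0, t1].
    return sum(math.log(rho(t)) for t in times) - integral(rho, t0, t1)

def log_density_partition(times, t0, t1, R):
    # Log of the product in (35) over R equal cells of width du.
    # An empty cell contributes log(1 - rho(u_r) * du); a cell holding one
    # event contributes log(rho(u_r) * du / du) = log(rho(u_r)).
    # Assumes R is large enough that no cell holds two events.
    du = (t1 - t0) / R
    occupied = {int((t - t0) / du) for t in times}
    total = 0.0
    for r in range(R):
        u = t0 + r * du
        if r in occupied:
            total += math.log(rho(u))
        else:
            total += math.log(1.0 - rho(u) * du)
    return total
```

With a fine enough partition the two functions should agree closely, which is what the limit $R \to \infty$ asserts.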


APPENDIX B: SOME DETAILS FOR THE EM ALGORITHM 

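B.1. The conditional expectation $\mathbb{E}[\log \mathcal{L}(\mathbf{T}, \mathbf{Z}) \mid \mathbf{T}; \Theta^*]$. First, by
(3), we have

$$
\begin{aligned}
\mathbb{E}[\log \mathcal{L}(\mathbf{T}, \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \mathbb{E}[\log \mathcal{L}(\mathbf{T} \mid \mathbf{Z}) + \log \mathcal{L}(\mathbf{Z}) \mid \mathbf{T}; \Theta^*] \\
&= \mathbb{E}\Big[ \sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \log \mathcal{L}^{I}(\mathbf{T}_{si} \mid \mathbf{Z}) + \sum_{1 \le i \ne j \le n} \log \mathcal{L}^{P_1}(\mathbf{T}_{ij} \mid \mathbf{Z}) \\
&\qquad + \sum_{i=1}^{n} \log \mathcal{L}^{P_2}(\mathbf{T}_{i} \mid \mathbf{Z}) + \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \log \mathcal{L}^{A}(\mathbf{T}_{ia} \mid \mathbf{Z}) + \log \mathcal{L}(\mathbf{Z}) \,\Big|\, \mathbf{T}; \Theta^* \Big].
\end{aligned}
\tag{36}
$$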

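Now, we plug in (18), (19), (20), (21) and (22), and the respective terms in
(36) are as follows. The $\mathcal{L}^{I}$ part is equal to

$$
\begin{aligned}
\sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \mathbb{E}[\log \mathcal{L}^{I}(\mathbf{T}_{si} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \mathbb{E}\Big[ \log \prod_{k=1}^{K} \Big( \prod_{h=1}^{m_{si}} \Big( P_{sk} \cdot \frac{1}{G_k^{sih}} \Big) \Big)^{z_{ik}} \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \sum_{k=1}^{K} \mathbb{E}\Big[ z_{ik} \cdot \sum_{h=1}^{m_{si}} \big( \log P_{sk} - \log G_k^{sih} \big) \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \sum_{k=1}^{K} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \log P_{sk} \big) \\
&\qquad - \sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \sum_{k=1}^{K} \mathbb{E}\Big[ z_{ik} \cdot \sum_{h=1}^{m_{si}} \log G_k^{sih} \,\Big|\, \mathbf{T}; \Theta^* \Big].
\end{aligned}
\tag{37}
$$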

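The $\mathcal{L}^{P_1}$ part is equal to

$$
\begin{aligned}
\sum_{1 \le i \ne j \le n} \mathbb{E}[\log \mathcal{L}^{P_1}(\mathbf{T}_{ij} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \sum_{1 \le i \ne j \le n} \mathbb{E}\Big[ \log \prod_{k=1}^{K} \prod_{l=1}^{K} \Big( \prod_{h=1}^{m_{ij}} \Big( \rho_{kl}(t_{ijh}) \cdot \frac{1}{G_l^{ijh}} \Big) \Big)^{z_{ik} z_{jl}} \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{1 \le i \ne j \le n} \mathbb{E}\Big[ \sum_{k=1}^{K} \sum_{l=1}^{K} z_{ik} z_{jl} \cdot \sum_{h=1}^{m_{ij}} \big( \log \rho_{kl}(t_{ijh}) - \log G_l^{ijh} \big) \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{1 \le i \ne j \le n} \sum_{k=1}^{K} \sum_{l=1}^{K} \Big( \mathbb{E}[z_{ik} z_{jl} \mid \mathbf{T}; \Theta^*] \cdot \sum_{h=1}^{m_{ij}} \log \rho_{kl}(t_{ijh}) \Big) \\
&\qquad - \sum_{1 \le i \ne j \le n} \sum_{k=1}^{K} \sum_{l=1}^{K} \mathbb{E}\Big[ z_{ik} z_{jl} \cdot \sum_{h=1}^{m_{ij}} \log G_l^{ijh} \,\Big|\, \mathbf{T}; \Theta^* \Big].
\end{aligned}
\tag{38}
$$
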
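The $\mathcal{L}^{P_2}$ part is equal to

$$
\begin{aligned}
\sum_{i=1}^{n} \mathbb{E}[\log \mathcal{L}^{P_2}(\mathbf{T}_{i} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \sum_{i=1}^{n} \mathbb{E}\Big[ \log \prod_{k=1}^{K} \Big( \prod_{h=1}^{M_i} \exp\Big( -\sum_{l=1}^{K} \int_{t_{ih}}^{t^{*}_{ih}} \rho_{kl}(t) \cdot I(G_l^{ih} > 0)\,dt \Big) \Big)^{z_{ik}} \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{h=1}^{M_i} \mathbb{E}\Big[ z_{ik} \cdot \Big( -\sum_{l=1}^{K} \int_{t_{ih}}^{t^{*}_{ih}} \rho_{kl}(t) \cdot I(G_l^{ih} > 0)\,dt \Big) \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= -\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{h=1}^{M_i} \sum_{l=1}^{K} \mathbb{E}\Big[ z_{ik} \int_{t_{ih}}^{t^{*}_{ih}} \rho_{kl}(t) \cdot I(G_l^{ih} > 0)\,dt \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= -\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{K} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} I(G_l^{ih} > 0) \mid \mathbf{T}; \Theta^*] \cdot \int_{t_{ih}}^{t^{*}_{ih}} \rho_{kl}(t)\,dt \Big),
\end{aligned}
\tag{39}
$$
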

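where we have pulled the indicator term $I(G_l^{ih} > 0)$ out of the integral in
the last step of (39) because the quantity $G_l^{ih}$ is a constant on any
interval of integration from $t_{ih}$ to $t^{*}_{ih}$,
as no player substitution can happen during that time. Finally, the $\mathcal{L}^{A}$ part
is equal to

$$
\begin{aligned}
\sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \mathbb{E}[\log \mathcal{L}^{A}(\mathbf{T}_{ia} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \mathbb{E}\Big[ \log \prod_{k=1}^{K} \Big( \prod_{h=1}^{m_{ia}} \rho_{ka}(t_{iah}) \cdot \prod_{h=1}^{M_i} \exp\Big( -\int_{t_{ih}}^{t^{*}_{ih}} \rho_{ka}(t)\,dt \Big) \Big)^{z_{ik}} \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \sum_{k=1}^{K} \mathbb{E}\Big[ z_{ik} \Big( \sum_{h=1}^{m_{ia}} \log \rho_{ka}(t_{iah}) - \sum_{h=1}^{M_i} \int_{t_{ih}}^{t^{*}_{ih}} \rho_{ka}(t)\,dt \Big) \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \sum_{k=1}^{K} \Big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \Big( \sum_{h=1}^{m_{ia}} \log \rho_{ka}(t_{iah}) - \sum_{h=1}^{M_i} \int_{t_{ih}}^{t^{*}_{ih}} \rho_{ka}(t)\,dt \Big) \Big),
\end{aligned}
\tag{40}
$$

and the $\mathcal{L}(\mathbf{Z})$ part is equal to

$$
\begin{aligned}
\mathbb{E}[\log \mathcal{L}(\mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \mathbb{E}\Big[ \log \prod_{i=1}^{n} \prod_{k=1}^{K} \pi_k^{z_{ik}} \,\Big|\, \mathbf{T}; \Theta^* \Big] \\
&= \sum_{i=1}^{n} \sum_{k=1}^{K} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \log \pi_k \big).
\end{aligned}
\tag{41}
$$
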
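B.2. Analytic updates of $\pi$ and $P$. In the conditional expectation of
the log-likelihood (36), the term that contains $P$, the transition probabilities
from initial actions to players in different groups, appears in (37). It is

$$
\sum_{s \in \mathcal{S}} \sum_{i=1}^{n} \sum_{k=1}^{K} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \log P_{sk} \big), \tag{42}
$$

but there is a constraint

$$
\sum_{k=1}^{K} P_{sk} = 1 \quad \text{for any } s \in \mathcal{S}. \tag{43}
$$

Introducing a Lagrange multiplier $\zeta_s$ for each $s \in \mathcal{S}$, we get

$$
\sum_{s \in \mathcal{S}} \Big[ \sum_{i=1}^{n} \sum_{k=1}^{K} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \log P_{sk} \big) - \zeta_s \Big( \sum_{k=1}^{K} P_{sk} - 1 \Big) \Big]. \tag{44}
$$

Differentiating with respect to each $P_{sk}$ and setting the derivatives to
zero, we get

$$
\frac{\sum_{i=1}^{n} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \big)}{P_{sk}} - \zeta_s = 0 \quad \text{for } s \in \mathcal{S} \text{ and } k = 1, 2, \ldots, K. \tag{45}
$$

The constraint (43) implies

$$
\zeta_s = \sum_{k=1}^{K} \sum_{i=1}^{n} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \big). \tag{46}
$$

Hence, we obtain the updating equation (26):

$$
P_{sk} = \frac{\sum_{i=1}^{n} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \big)}{\sum_{k'=1}^{K} \sum_{i=1}^{n} \big( \mathbb{E}[z_{ik'} \mid \mathbf{T}; \Theta^*] \cdot m_{si} \big)}. \tag{47}
$$

The updating equation (25) for $(\pi_1, \pi_2, \ldots, \pi_K)$ can be derived in a similar
manner; the actual derivation is omitted.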
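The update (47) amounts to normalizing responsibility-weighted counts, which the following Python sketch illustrates. The names `resp` (for the posterior expectations $\mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*]$) and `m` (for the counts $m_{si}$) are hypothetical, and both inputs are assumed to be available from the E-step. The function `update_pi` applies the same Lagrangian argument to (41) with the constraint $\sum_k \pi_k = 1$; it is our own illustration of that argument, assuming each row of `resp` sums to one, rather than a transcription of (25).

```python
def update_P(resp, m):
    # resp[i][k]: posterior expectation E[z_ik | T; Theta*] (assumed given).
    # m[s][i]: number of times initial action s led to player i (assumed given).
    n, K = len(resp), len(resp[0])
    P = {}
    for s, counts in m.items():
        # Numerator of (47) for each cluster k.
        w = [sum(resp[i][k] * counts[i] for i in range(n)) for k in range(K)]
        total = sum(w)  # equals the Lagrange multiplier zeta_s in (46)
        P[s] = [wk / total for wk in w]
    return P

def update_pi(resp):
    # Maximizer of (41) subject to sum_k pi_k = 1, assuming each row of
    # resp sums to one: the average responsibility of each cluster.
    n, K = len(resp), len(resp[0])
    return [sum(resp[i][k] for i in range(n)) / n for k in range(K)]
```

An initial action that is never observed gives `total == 0`; a practical implementation would need to handle that case separately.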


B.3. $\mathbb{E}[\log \mathcal{L}(\mathbf{T}, \mathbf{Z}) \mid \mathbf{T}; \Theta^*]$ under model simplifications (27)-(28).
In Section 5, we introduced further simplifications to our Continuous-time
SBM, namely (27) and (28), before applying it to analyze basketball games.
Here, we provide details about the changes to some of the components (37)-(41)
for $\mathbb{E}[\log \mathcal{L}(\mathbf{T}, \mathbf{Z}) \mid \mathbf{T}; \Theta^*]$ as a result of these simplifications. The
components (37) and (41) do not involve any rate functions, so they remain the
same; whereas the components (38)-(40) now become


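$$
\begin{aligned}
\sum_{1 \le i \ne j \le n} \mathbb{E}[\log \mathcal{L}^{P_1}(\mathbf{T}_{ij} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
&= \sum_{1 \le i \ne j \le n} \sum_{k=1}^{K} \sum_{l=1}^{K} \Big( \mathbb{E}[z_{ik} z_{jl} \mid \mathbf{T}; \Theta^*] \cdot \Big( \sum_{h=1}^{m_{ij}} \log \lambda_k(t_{ijh}) + m_{ij} \log P_{kl} \Big) \Big) \\
&\qquad - \sum_{1 \le i \ne j \le n} \sum_{k=1}^{K} \sum_{l=1}^{K} \mathbb{E}\Big[ z_{ik} z_{jl} \cdot \sum_{h=1}^{m_{ij}} \log G_l^{ijh} \,\Big|\, \mathbf{T}; \Theta^* \Big],
\end{aligned}
\tag{48}
$$

$$
\sum_{i=1}^{n} \mathbb{E}[\log \mathcal{L}^{P_2}(\mathbf{T}_{i} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
= -\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{K} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} I(G_l^{ih} > 0) \mid \mathbf{T}; \Theta^*] \cdot P_{kl} \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big),
\tag{49}
$$

and

$$
\sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \mathbb{E}[\log \mathcal{L}^{A}(\mathbf{T}_{ia} \mid \mathbf{Z}) \mid \mathbf{T}; \Theta^*]
= \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \sum_{k=1}^{K} \Big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \Big( \sum_{h=1}^{m_{ia}} \log \lambda_k(t_{iah}) + m_{ia} \log P_{ka} - P_{ka} \sum_{h=1}^{M_i} \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) \Big).
\tag{50}
$$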


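B.4. Analytic updates of $P_{kl}$, $P_{ka}$ under model simplifications
(27)-(28). Recall that, under model simplifications (27)-(28), the constraint
on these transition probabilities is given by (29):

$$
\sum_{l=1}^{K} P_{kl} + \sum_{a \in \mathcal{A}} P_{ka} = 1, \quad k = 1, 2, \ldots, K. \tag{51}
$$

Again, we introduce a Lagrange multiplier $\zeta_k$ for $k = 1, 2, \ldots, K$. Combining
the terms from (48)-(50) that involve these transition probabilities with the
constraint above, we obtain the Lagrangian function,

$$
\begin{aligned}
&\sum_{1 \le i \ne j \le n} \sum_{k=1}^{K} \sum_{l=1}^{K} \big( \mathbb{E}[z_{ik} z_{jl} \mid \mathbf{T}; \Theta^*] \cdot m_{ij} \cdot \log P_{kl} \big) \\
&\quad - \sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{K} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} I(G_l^{ih} > 0) \mid \mathbf{T}; \Theta^*] \cdot P_{kl} \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) \\
&\quad + \sum_{i=1}^{n} \sum_{a \in \mathcal{A}} \sum_{k=1}^{K} \Big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \Big( m_{ia} \cdot \log P_{ka} - P_{ka} \sum_{h=1}^{M_i} \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) \Big) \\
&\quad - \sum_{k=1}^{K} \zeta_k \cdot \Big( \sum_{l=1}^{K} P_{kl} + \sum_{a \in \mathcal{A}} P_{ka} - 1 \Big).
\end{aligned}
\tag{52}
$$
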
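Differentiating with respect to each $P_{kl}$, $P_{ka}$ and setting the derivatives
to zero, we get

$$
\frac{\sum_{1 \le i \ne j \le n} \big( \mathbb{E}[z_{ik} z_{jl} \mid \mathbf{T}; \Theta^*] \cdot m_{ij} \big)}{P_{kl}}
- \sum_{i=1}^{n} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} I(G_l^{ih} > 0) \mid \mathbf{T}; \Theta^*] \cdot \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) - \zeta_k = 0,
\tag{53}
$$

and

$$
\frac{\sum_{i=1}^{n} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{ia} \big)}{P_{ka}}
- \sum_{i=1}^{n} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) - \zeta_k = 0,
\tag{54}
$$

from which we can solve for the transition probabilities:

$$
P_{kl} = \frac{\sum_{1 \le i \ne j \le n} \big( \mathbb{E}[z_{ik} z_{jl} \mid \mathbf{T}; \Theta^*] \cdot m_{ij} \big)}{\sum_{i=1}^{n} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} I(G_l^{ih} > 0) \mid \mathbf{T}; \Theta^*] \cdot \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) + \zeta_k},
\tag{55}
$$

$$
P_{ka} = \frac{\sum_{i=1}^{n} \big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot m_{ia} \big)}{\sum_{i=1}^{n} \sum_{h=1}^{M_i} \Big( \mathbb{E}[z_{ik} \mid \mathbf{T}; \Theta^*] \cdot \int_{t_{ih}}^{t^{*}_{ih}} \lambda_k(t)\,dt \Big) + \zeta_k},
\tag{56}
$$

for $k, l = 1, 2, \ldots, K$ and $a \in \mathcal{A}$. Each Lagrange multiplier $\zeta_k$ can be solved
numerically as the (univariate) root to the equation $\sum_{l=1}^{K} P_{kl} + \sum_{a \in \mathcal{A}} P_{ka} = 1$
for each $k$. We do this with the R function uniroot.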
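To make the root-finding step concrete, the sketch below solves for a single multiplier $\zeta_k$ by bisection. It is a Python stand-in written for illustration, not the R code used in our analysis. For a fixed $k$, the hypothetical list `a` holds the numerators of (55)-(56) over all $l$ and $a \in \mathcal{A}$, and `b` holds the matching denominators without $\zeta_k$; both are assumed to be nonnegative and precomputed.

```python
def solve_zeta(a, b):
    # Find zeta with sum_j a[j] / (b[j] + zeta) = 1, searching above the
    # largest pole so that every resulting probability is positive.
    active = [(aj, bj) for aj, bj in zip(a, b) if aj > 0]

    def excess(z):
        return sum(aj / (bj + z) for aj, bj in active) - 1.0

    lo = -min(bj for _, bj in active)  # pole: excess tends to +infinity just above it
    hi = sum(aj for aj, _ in active)   # excess(hi) <= 0 because every b[j] >= 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:     # interval has shrunk to machine precision
            break
        if excess(mid) > 0:
            lo = mid                   # root lies to the right
        else:
            hi = mid                   # root lies to the left (or at mid)
    return 0.5 * (lo + hi)

def transition_probs(a, b):
    # Plug the solved multiplier back into (55)-(56).
    zeta = solve_zeta(a, b)
    return [aj / (bj + zeta) for aj, bj in zip(a, b)]
```

Because the left-hand side of the constraint is strictly decreasing in $\zeta_k$ on the search interval, the root there is unique, so any bracketing root finder should return the same value.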


APPENDIX C: CONFIDENCE BANDS FOR ESTIMATED RATE FUNCTIONS

It is possible to obtain confidence bands for the estimated rate functions
conditional on the cluster labels by calculating the pointwise standard errors
using the observed Fisher information matrix and the standard Delta
method. As an example, rate functions displayed on top of each other in
Figure 6 (to facilitate side-by-side comparison in Section 5.2) are now displayed
individually in Figure 9 with their respective 95% confidence bands.
In all panels of Figure 9, we can see that, as $t \to 24$, the confidence intervals
invariably widen. This is because there are fewer transactions as the
time approaches the end limit for each play, since many plays end before
reaching the full 24-second limit. Elsewhere, these confidence intervals are
narrow enough to suggest that features identified in Figure 6 and discussed
in Section 5.2 are unlikely to be merely artifacts due to noise in the data.
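
The pointwise calculation can be sketched as follows in Python. The sketch assumes, purely for illustration, that a rate function is log-linear in some basis, $\lambda(t) = \exp\{b(t)^\top \beta\}$, and that `cov` is the inverse of the observed Fisher information for $\beta$. The parametrization and the names `basis`, `beta` and `cov` are hypothetical and need not match the one used in our analysis.

```python
import math

def delta_se(grad, cov):
    # Delta method: standard error of g(beta_hat) is sqrt(grad^T cov grad),
    # where grad is the gradient of g evaluated at beta_hat.
    n = len(grad)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(var)

def rate_band(basis, beta, cov, z=1.96):
    # Assumed model: lambda(t) = exp(basis(t) . beta), so the gradient of
    # lambda with respect to beta is lambda * basis(t).
    lam = math.exp(sum(b * c for b, c in zip(basis, beta)))
    se = delta_se([lam * b for b in basis], cov)
    # z = 1.96 gives an approximate 95% pointwise interval.
    return lam - z * se, lam, lam + z * se
```

Evaluating `rate_band` on a grid of $t$ values traces out a pointwise band; where few transactions inform the estimate, the entries of `cov` grow and the band widens accordingly.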


[Figure 9: image not reproduced; surviving panel title: Heat]

Fig 9. Rate functions displayed on top of each other in Figure 6 are displayed here individually
with 95% pointwise confidence bands.